diff mbox series

string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled

Message ID 20250308023314.3981455-1-pcc@google.com (mailing list archive)
State New

Commit Message

Peter Collingbourne March 8, 2025, 2:33 a.m. UTC
The optimized strscpy() and dentry_string_cmp() routines will read 8
unaligned bytes at a time via the function read_word_at_a_time(), but
this is incompatible with MTE which will fault on a partially invalid
read. The attributes on read_word_at_a_time() that disable KASAN are
invisible to the CPU so they have no effect on MTE. Let's fix the
bug for now by disabling the optimizations if the kernel is built
with HW tag-based KASAN and consider improvements for followup changes.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Cc: stable@vger.kernel.org
---
 fs/dcache.c  | 2 +-
 lib/string.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

Comments

Kees Cook March 8, 2025, 3:36 a.m. UTC | #1
On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> The optimized strscpy() and dentry_string_cmp() routines will read 8
> unaligned bytes at a time via the function read_word_at_a_time(), but
> this is incompatible with MTE which will fault on a partially invalid
> read. The attributes on read_word_at_a_time() that disable KASAN are
> invisible to the CPU so they have no effect on MTE. Let's fix the
> bug for now by disabling the optimizations if the kernel is built
> with HW tag-based KASAN and consider improvements for followup changes.

Why is faulting on a partially invalid read a problem? It's still
invalid, so ... it should fault, yes? What am I missing?

> 
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Cc: stable@vger.kernel.org
> ---
>  fs/dcache.c  | 2 +-
>  lib/string.c | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)

Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
things? I can see at least one place where it's directly tied:

arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS

Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
into account at the Kconfig level instead?
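For illustration only (this is not part of the posted patch), such a Kconfig-level gate could follow the existing `if !KMSAN` pattern that s390 and x86 already use, e.g. in arch/arm64/Kconfig:

```kconfig
# Hypothetical sketch: drop the word-at-a-time optimizations at the
# Kconfig level when HW tag-based KASAN is enabled, mirroring the
# existing "if !KMSAN" conditionals on s390/x86.
select DCACHE_WORD_ACCESS if !KASAN_HW_TAGS
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !KASAN_HW_TAGS
```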

> diff --git a/fs/dcache.c b/fs/dcache.c
> index e3634916ffb93..71f0830ac5e69 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -223,7 +223,7 @@ fs_initcall(init_fs_dcache_sysctls);
>   * Compare 2 name strings, return 0 if they match, otherwise non-zero.
>   * The strings are both count bytes long, and count is non-zero.
>   */
> -#ifdef CONFIG_DCACHE_WORD_ACCESS
> +#if defined(CONFIG_DCACHE_WORD_ACCESS) && !defined(CONFIG_KASAN_HW_TAGS)

Why not also the word_at_a_time use in fs/namei.c and lib/siphash.c?

For reference, here are the DCACHE_WORD_ACCESS places:

arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm64/Kconfig:137: select DCACHE_WORD_ACCESS
arch/powerpc/Kconfig:192:       select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN
arch/riscv/Kconfig:934: select DCACHE_WORD_ACCESS if MMU
arch/s390/Kconfig:154:  select DCACHE_WORD_ACCESS if !KMSAN
arch/x86/Kconfig:160:   select DCACHE_WORD_ACCESS if !KMSAN
arch/x86/um/Kconfig:12: select DCACHE_WORD_ACCESS

>  
>  #include <asm/word-at-a-time.h>
>  /*
> diff --git a/lib/string.c b/lib/string.c
> index eb4486ed40d25..9a43a3824d0d7 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
>  		return -E2BIG;
>  
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
> +	!defined(CONFIG_KASAN_HW_TAGS)

There are lots more places checking CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS...
Why only here?

And the Kconfigs since I was comparing these against DCACHE_WORD_ACCESS

arch/arc/Kconfig:352:   select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm/Kconfig:107:   select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
arch/arm64/Kconfig:222: select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/loongarch/Kconfig:140:     select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
arch/m68k/Kconfig:33:   select HAVE_EFFICIENT_UNALIGNED_ACCESS if !CPU_HAS_NO_UNALIGNED
arch/powerpc/Kconfig:246:       select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/riscv/Kconfig:935: select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/s390/Kconfig:197:  select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/x86/Kconfig:238:   select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/x86/um/Kconfig:13: select HAVE_EFFICIENT_UNALIGNED_ACCESS

>  	/*
>  	 * If src is unaligned, don't cross a page boundary,
>  	 * since we don't know if the next page is mapped.
> -- 
> 2.49.0.rc0.332.g42c0ae87b1-goog
> 

-Kees
Catalin Marinas March 10, 2025, 5:29 p.m. UTC | #2
On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> The optimized strscpy() and dentry_string_cmp() routines will read 8
> unaligned bytes at a time via the function read_word_at_a_time(), but
> this is incompatible with MTE which will fault on a partially invalid
> read. The attributes on read_word_at_a_time() that disable KASAN are
> invisible to the CPU so they have no effect on MTE. Let's fix the
> bug for now by disabling the optimizations if the kernel is built
> with HW tag-based KASAN and consider improvements for followup changes.
> 
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Cc: stable@vger.kernel.org

Some time ago Vincenzo had an attempt at fixing this but neither of us
got around to posting it. It's on top of 6.2 and I'm not sure how cleanly it
would rebase:

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/mte-strscpy

Feel free to cherry-pick patches from above, rewrite them etc.

> diff --git a/lib/string.c b/lib/string.c
> index eb4486ed40d25..9a43a3824d0d7 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
>  		return -E2BIG;
>  
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
> +	!defined(CONFIG_KASAN_HW_TAGS)

Assuming that no-one wants to ever use KASAN_HW_TAGS=y in production,
this patch would do. Otherwise I'd rather use TCO around the access as
per the last patch from Vincenzo above.

Yet another option: use load_unaligned_zeropad() instead of
read_word_at_a_time(); I'm not sure it changes the semantics of
strscpy() in any way. This can be done in the arch code.
Catalin Marinas March 10, 2025, 5:37 p.m. UTC | #3
On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > unaligned bytes at a time via the function read_word_at_a_time(), but
> > this is incompatible with MTE which will fault on a partially invalid
> > read. The attributes on read_word_at_a_time() that disable KASAN are
> > invisible to the CPU so they have no effect on MTE. Let's fix the
> > bug for now by disabling the optimizations if the kernel is built
> > with HW tag-based KASAN and consider improvements for followup changes.
> 
> Why is faulting on a partially invalid read a problem? It's still
> invalid, so ... it should fault, yes? What am I missing?

read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
beyond the end of string. The has_zero() function is then used to check
where the string ends. For this use, I think we can go with
load_unaligned_zeropad() which handles a potential fault and pads the
rest with zeroes.

> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> > Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> > Cc: stable@vger.kernel.org
> > ---
> >  fs/dcache.c  | 2 +-
> >  lib/string.c | 3 ++-
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
> things? I can see at least one place where it's directly tied:
> 
> arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS

DCACHE_WORD_ACCESS requires load_unaligned_zeropad() which handles the
faults. For some reason, read_word_at_a_time() doesn't expect to fault
and it is only used with HAVE_EFFICIENT_UNALIGNED_ACCESS. I guess arm32
only enabled load_unaligned_zeropad() on hardware that supports
efficient unaligned accesses (v6 onwards), hence the dependency.

> Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
> into account at the Kconfig level instead?

I don't think we should play with config options but rather sort out the
fault path (load_unaligned_zeropad) or disable MTE temporarily. I'd go
with the former as long as read_word_at_a_time() is only used for
strings in conjunction with has_zero(). I haven't checked.
Kees Cook March 10, 2025, 6:09 p.m. UTC | #4
On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > this is incompatible with MTE which will fault on a partially invalid
> > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > bug for now by disabling the optimizations if the kernel is built
> > > with HW tag-based KASAN and consider improvements for followup changes.
> > 
> > Why is faulting on a partially invalid read a problem? It's still
> > invalid, so ... it should fault, yes? What am I missing?
> 
> read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> beyond the end of string. The has_zero() function is then used to check
> where the string ends. For this use, I think we can go with
> load_unaligned_zeropad() which handles a potential fault and pads the
> rest with zeroes.

Agh, right, I keep forgetting that this can read past the end of the
actual allocation. I'd agree, load_unaligned_zeropad() makes sense
there.

> 
> > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > > Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> > > Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  fs/dcache.c  | 2 +-
> > >  lib/string.c | 3 ++-
> > >  2 files changed, 3 insertions(+), 2 deletions(-)
> > 
> > Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
> > things? I can see at least one place where it's directly tied:
> > 
> > arch/arm/Kconfig:58:    select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
> 
> DCACHE_WORD_ACCESS requires load_unaligned_zeropad() which handles the
> faults. For some reason, read_word_at_a_time() doesn't expect to fault
> and it is only used with HAVE_EFFICIENT_UNALIGNED_ACCESS. I guess arm32
> only enabled load_unaligned_zeropad() on hardware that supports
> efficient unaligned accesses (v6 onwards), hence the dependency.
> 
> > Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
> > into account at the Kconfig level instead?
> 
> I don't think we should play with config options but rather sort out the
> fault path (load_unaligned_zeropad) or disable MTE temporarily. I'd go
> with the former as long as read_word_at_a_time() is only used for
> strings in conjunction with has_zero(). I haven't checked.

Okay, sounds good. (And with a mild thread-merge: yes, folks want to use
KASAN_HW_TAGS=y in production.)
Mark Rutland March 10, 2025, 6:13 p.m. UTC | #5
On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > this is incompatible with MTE which will fault on a partially invalid
> > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > bug for now by disabling the optimizations if the kernel is built
> > > with HW tag-based KASAN and consider improvements for followup changes.
> > 
> > Why is faulting on a partially invalid read a problem? It's still
> > invalid, so ... it should fault, yes? What am I missing?
> 
> read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> beyond the end of string. The has_zero() function is then used to check
> where the string ends. For this use, I think we can go with
> load_unaligned_zeropad() which handles a potential fault and pads the
> rest with zeroes.

If we only care about synchronous and asymmetric modes, that should be
possible, but that won't work in asynchronous mode. In asynchronous mode
the fault will accumulate into TFSR and will be detected later
asynchronously where it cannot be related to its source and fixed up.

That means that both read_word_at_a_time() and load_unaligned_zeropad()
are dodgy in async mode.

Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?

... and is there anything else that deliberately makes accesses that
could straddle objects?

Mark.
Catalin Marinas March 10, 2025, 6:40 p.m. UTC | #6
On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > this is incompatible with MTE which will fault on a partially invalid
> > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > bug for now by disabling the optimizations if the kernel is built
> > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > 
> > > Why is faulting on a partially invalid read a problem? It's still
> > > invalid, so ... it should fault, yes? What am I missing?
> > 
> > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > beyond the end of string. The has_zero() function is then used to check
> > where the string ends. For this use, I think we can go with
> > load_unaligned_zeropad() which handles a potential fault and pads the
> > rest with zeroes.
> 
> If we only care about synchronous and asymmetric modes, that should be
> possible, but that won't work in asynchronous mode. In asynchronous mode
> the fault will accumulate into TFSR and will be detected later
> asynchronously where it cannot be related to its source and fixed up.
> 
> That means that both read_word_at_a_time() and load_unaligned_zeropad()
> are dodgy in async mode.

load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
indeed busted and I've had Vincenzo's patches for a couple of years
already, they just never made it to the list.

> Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?

We could, though that was mostly for user-space faults while in-kernel
we'd only need something similar if KASAN_HW_TAGS.

> ... and is there anything else that deliberately makes accesses that
> could straddle objects?

So far we only came across load_unaligned_zeropad() and
read_word_at_a_time(). I'm not aware of anything else.
Mark Rutland March 10, 2025, 7:37 p.m. UTC | #7
On Mon, Mar 10, 2025 at 06:40:11PM +0000, Catalin Marinas wrote:
> On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> > On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > > this is incompatible with MTE which will fault on a partially invalid
> > > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > > bug for now by disabling the optimizations if the kernel is built
> > > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > > 
> > > > Why is faulting on a partially invalid read a problem? It's still
> > > > invalid, so ... it should fault, yes? What am I missing?
> > > 
> > > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > > beyond the end of string. The has_zero() function is then used to check
> > > where the string ends. For this use, I think we can go with
> > > load_unaligned_zeropad() which handles a potential fault and pads the
> > > rest with zeroes.
> > 
> > If we only care about synchronous and asymmetric modes, that should be
> > possible, but that won't work in asynchronous mode. In asynchronous mode
> > the fault will accumulate into TFSR and will be detected later
> > asynchronously where it cannot be related to its source and fixed up.
> > 
> > That means that both read_word_at_a_time() and load_unaligned_zeropad()
> > are dodgy in async mode.
> 
> load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
> PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
> indeed busted and I've had Vincezo's patches for a couple of years
> already, they just never made it to the list.

Sorry, I missed the __mte_{enable,disable}_tco_async() calls. So long as
we're happy to omit the check in that case, that's fine.

I was worried that ex_handler_load_unaligned_zeropad() might not do the
right thing in response to a tag check fault (e.g. access the wrong 8
bytes), but it looks as though that's ok due to the way it generates the
offset and the aligned pointer.

If load_unaligned_zeropad() is handed a string that starts with an
unexpected tag (and even if that starts off aligned),
ex_handler_load_unaligned_zeropad() will access that and cause another
tag check fault, which will be reported.

Mark.
Catalin Marinas March 11, 2025, 11:45 a.m. UTC | #8
On Mon, Mar 10, 2025 at 07:37:32PM +0000, Mark Rutland wrote:
> On Mon, Mar 10, 2025 at 06:40:11PM +0000, Catalin Marinas wrote:
> > On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > > > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > > > this is incompatible with MTE which will fault on a partially invalid
> > > > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > > > bug for now by disabling the optimizations if the kernel is built
> > > > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > > > 
> > > > > Why is faulting on a partially invalid read a problem? It's still
> > > > > invalid, so ... it should fault, yes? What am I missing?
> > > > 
> > > > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > > > beyond the end of string. The has_zero() function is then used to check
> > > > where the string ends. For this use, I think we can go with
> > > > load_unaligned_zeropad() which handles a potential fault and pads the
> > > > rest with zeroes.
> > > 
> > > If we only care about synchronous and asymmetric modes, that should be
> > > possible, but that won't work in asynchronous mode. In asynchronous mode
> > > the fault will accumulate into TFSR and will be detected later
> > > asynchronously where it cannot be related to its source and fixed up.
> > > 
> > > That means that both read_word_at_a_time() and load_unaligned_zeropad()
> > > are dodgy in async mode.
> > 
> > load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
> > PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
> indeed busted and I've had Vincenzo's patches for a couple of years
> > already, they just never made it to the list.
> 
> Sorry, I missed the __mte_{enable,disable}_tco_async() calls. So long as
> we're happy to omit the check in that case, that's fine.

That was the easiest. Alternatively we can try to sync the TFSR before
and after the load but with the ISBs, that's too expensive. We could
also do a dummy one byte load before setting TCO. read_word_at_a_time()
does have an explicit kasan_check_read() but last time I checked it has
no effect on MTE.

> I was worried that ex_handler_load_unaligned_zeropad() might not do the
> right thing in response to a tag check fault (e.g. access the wrong 8
> bytes), but it looks as though that's ok due to the way it generates the
> offset and the aligned pointer.
> 
> If load_unaligned_zeropad() is handed a string that starts with an
> unexpected tag (and even if that starts off aligned),
> ex_handler_load_unaligned_zeropad() will access that and cause another
> tag check fault, which will be reported.

Yes, it will report an async tag check fault on the
exit_to_kernel_mode() path _if_ load_unaligned_zeropad() triggered the
fault for other reasons (end of page). It's slightly inconsistent; we
could set TCO for the async case in ex_handler_load_unaligned_zeropad()
as well.

For sync checks, we'd get the first fault ending up in
ex_handler_load_unaligned_zeropad() and a second tag check fault while
processing the first. This ends up in do_tag_recovery and we disable tag
checking after the report. Not ideal but not that bad. We could adjust
ex_handler_load_unaligned_zeropad() to return false if the pointer is
already aligned but we need to check the semantics of
load_unaligned_zeropad(): is it allowed to fault on the first byte?
Mark Rutland March 11, 2025, 11:55 a.m. UTC | #9
On Tue, Mar 11, 2025 at 11:45:21AM +0000, Catalin Marinas wrote:
> On Mon, Mar 10, 2025 at 07:37:32PM +0000, Mark Rutland wrote:
> > I was worried that ex_handler_load_unaligned_zeropad() might not do the
> > right thing in response to a tag check fault (e.g. access the wrong 8
> > bytes), but it looks as though that's ok due to the way it generates the
> > offset and the aligned pointer.
> > 
> > If load_unaligned_zeropad() is handed a string that starts with an
> > unexpected tag (and even if that starts off aligned),
> > ex_handler_load_unaligned_zeropad() will access that and cause another
> > tag check fault, which will be reported.
> 
> Yes, it will report an async tag check fault on the
> exit_to_kernel_mode() path _if_ load_unaligned_zeropad() triggered the
> fault for other reasons (end of page).

Sorry, yes. The aligned case I mentioned shouldn't apply here.

> It's slightly inconsistent, we could set TCO for the async case in
> ex_handler_load_unaligned_zeropad() as well.

Yep, I think that'd be necessary for async mode.

> For sync checks, we'd get the first fault ending up in
> ex_handler_load_unaligned_zeropad() and a second tag check fault while
> processing the first. This ends up in do_tag_recovery and we disable
> tag checking after the report. Not ideal but not that bad.

Yep; that's what I was describing in the second paragraph above, though
I forgot to say that was assuming sync or asymm mode.

> We could adjust ex_handler_load_unaligned_zeropad() to return false if
> the pointer is already aligned but we need to check the semantics of
> load_unaligned_zeropad(), is it allowed to fault on the first byte?

IIUC today it's only expected to fault due to misalignment, and the
general expectation is that for a sequence of load_unaligned_zeropad()
calls, we should get at least one byte without faulting (for the NUL
terminator).

I reckon it'd be better to figure this out based on the ESR if possible.
Kristina's patches for MOPS would give us that.

Mark.

Patch

diff --git a/fs/dcache.c b/fs/dcache.c
index e3634916ffb93..71f0830ac5e69 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -223,7 +223,7 @@  fs_initcall(init_fs_dcache_sysctls);
  * Compare 2 name strings, return 0 if they match, otherwise non-zero.
  * The strings are both count bytes long, and count is non-zero.
  */
-#ifdef CONFIG_DCACHE_WORD_ACCESS
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && !defined(CONFIG_KASAN_HW_TAGS)
 
 #include <asm/word-at-a-time.h>
 /*
diff --git a/lib/string.c b/lib/string.c
index eb4486ed40d25..9a43a3824d0d7 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -119,7 +119,8 @@  ssize_t sized_strscpy(char *dest, const char *src, size_t count)
 	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
 		return -E2BIG;
 
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
+	!defined(CONFIG_KASAN_HW_TAGS)
 	/*
 	 * If src is unaligned, don't cross a page boundary,
 	 * since we don't know if the next page is mapped.