Message ID | 20241022015913.3524425-2-samuel.holland@sifive.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [v2,1/9] kasan: sw_tags: Use arithmetic shift for shadow computation | expand |
On Tue, Oct 22, 2024 at 3:59 AM Samuel Holland <samuel.holland@sifive.com> wrote: > > Currently, kasan_mem_to_shadow() uses a logical right shift, which turns > canonical kernel addresses into non-canonical addresses by clearing the > high KASAN_SHADOW_SCALE_SHIFT bits. The value of KASAN_SHADOW_OFFSET is > then chosen so that the addition results in a canonical address for the > shadow memory. > > For KASAN_GENERIC, this shift/add combination is ABI with the compiler, > because KASAN_SHADOW_OFFSET is used in compiler-generated inline tag > checks[1], which must only attempt to dereference canonical addresses. > > However, for KASAN_SW_TAGS we have some freedom to change the algorithm > without breaking the ABI. Because TBI is enabled for kernel addresses, > the top bits of shadow memory addresses computed during tag checks are > irrelevant, and so likewise are the top bits of KASAN_SHADOW_OFFSET. > This is demonstrated by the fact that LLVM uses a logical right shift > in the tag check fast path[2] but a sbfx (signed bitfield extract) > instruction in the slow path[3] without causing any issues. > > Using an arithmetic shift in kasan_mem_to_shadow() provides a number of > benefits: > > 1) The memory layout is easier to understand. KASAN_SHADOW_OFFSET > becomes a canonical memory address, and the shifted pointer becomes a > negative offset, so KASAN_SHADOW_OFFSET == KASAN_SHADOW_END regardless > of the shift amount or the size of the virtual address space. > > 2) KASAN_SHADOW_OFFSET becomes a simpler constant, requiring only one > instruction to load instead of two. Since it must be loaded in each > function with a tag check, this decreases kernel text size by 0.5%. > > 3) This shift and the sign extension from kasan_reset_tag() can be > combined into a single sbfx instruction. When this same algorithm change > is applied to the compiler, it removes an instruction from each inline > tag check, further reducing kernel text size by an additional 4.6%. > > These benefits extend to other architectures as well. On RISC-V, where > the baseline ISA does not shifted addition or have an equivalent to the > sbfx instruction, loading KASAN_SHADOW_OFFSET is reduced from 3 to 2 > instructions, and kasan_mem_to_shadow(kasan_reset_tag(addr)) similarly > combines two consecutive right shifts. > > Link: https://github.com/llvm/llvm-project/blob/llvmorg-20-init/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp#L1316 [1] > Link: https://github.com/llvm/llvm-project/blob/llvmorg-20-init/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp#L895 [2] > Link: https://github.com/llvm/llvm-project/blob/llvmorg-20-init/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L669 [3] > Signed-off-by: Samuel Holland <samuel.holland@sifive.com> > --- > > Changes in v2: > - Improve the explanation for how KASAN_SHADOW_END is derived > - Update the range check in kasan_non_canonical_hook() > > arch/arm64/Kconfig | 10 +++++----- > arch/arm64/include/asm/memory.h | 17 +++++++++++++++-- > arch/arm64/mm/kasan_init.c | 7 +++++-- > include/linux/kasan.h | 10 ++++++++-- > mm/kasan/report.c | 22 ++++++++++++++++++---- > scripts/gdb/linux/mm.py | 5 +++-- > 6 files changed, 54 insertions(+), 17 deletions(-) > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index fd9df6dcc593..6a326908c941 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -418,11 +418,11 @@ config KASAN_SHADOW_OFFSET > default 0xdffffe0000000000 if ARM64_VA_BITS_42 && !KASAN_SW_TAGS > default 0xdfffffc000000000 if ARM64_VA_BITS_39 && !KASAN_SW_TAGS > default 0xdffffff800000000 if ARM64_VA_BITS_36 && !KASAN_SW_TAGS > - default 0xefff800000000000 if (ARM64_VA_BITS_48 || (ARM64_VA_BITS_52 && !ARM64_16K_PAGES)) && KASAN_SW_TAGS > - default 0xefffc00000000000 if (ARM64_VA_BITS_47 || ARM64_VA_BITS_52) && ARM64_16K_PAGES && KASAN_SW_TAGS > - default 0xeffffe0000000000 if ARM64_VA_BITS_42 && KASAN_SW_TAGS > - default 0xefffffc000000000 if ARM64_VA_BITS_39 && KASAN_SW_TAGS > - default 0xeffffff800000000 if ARM64_VA_BITS_36 && KASAN_SW_TAGS > + default 0xffff800000000000 if (ARM64_VA_BITS_48 || (ARM64_VA_BITS_52 && !ARM64_16K_PAGES)) && KASAN_SW_TAGS > + default 0xffffc00000000000 if (ARM64_VA_BITS_47 || ARM64_VA_BITS_52) && ARM64_16K_PAGES && KASAN_SW_TAGS > + default 0xfffffe0000000000 if ARM64_VA_BITS_42 && KASAN_SW_TAGS > + default 0xffffffc000000000 if ARM64_VA_BITS_39 && KASAN_SW_TAGS > + default 0xfffffff800000000 if ARM64_VA_BITS_36 && KASAN_SW_TAGS > default 0xffffffffffffffff > > config UNWIND_TABLES > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h > index 0480c61dbb4f..a93fc9dc16f3 100644 > --- a/arch/arm64/include/asm/memory.h > +++ b/arch/arm64/include/asm/memory.h > @@ -80,7 +80,8 @@ > * where KASAN_SHADOW_SCALE_SHIFT is the order of the number of bits that map > * to a single shadow byte and KASAN_SHADOW_OFFSET is a constant that offsets > * the mapping. Note that KASAN_SHADOW_OFFSET does not point to the start of > - * the shadow memory region. > + * the shadow memory region, since not all possible addresses have shadow > + * memory allocated for them. I'm not sure this addition makes sense: the original statement was to point out that KASAN_SHADOW_OFFSET and KASAN_SHADOW_START are different values. Even if we were to map shadow for userspace, KASAN_SHADOW_OFFSET would still be a weird offset value for Generic KASAN. > * > * Based on this mapping, we define two constants: > * > @@ -89,7 +90,15 @@ > * > * KASAN_SHADOW_END is defined first as the shadow address that corresponds to > * the upper bound of possible virtual kernel memory addresses UL(1) << 64 > - * according to the mapping formula. > + * according to the mapping formula. For Generic KASAN, the address in the > + * mapping formula is treated as unsigned (part of the compiler's ABI), so the > + * end of the shadow memory region is at a large positive offset from > + * KASAN_SHADOW_OFFSET. For Software Tag-Based KASAN, the address in the > + * formula is treated as signed. Since all kernel addresses are negative, they > + * map to shadow memory below KASAN_SHADOW_OFFSET, making KASAN_SHADOW_OFFSET > + * itself the end of the shadow memory region. (User pointers are positive and > + * would map to shadow memory above KASAN_SHADOW_OFFSET, but shadow memory is > + * not allocated for them.) This looks good! > * > * KASAN_SHADOW_START is defined second based on KASAN_SHADOW_END. The shadow > * memory start must map to the lowest possible kernel virtual memory address > @@ -100,7 +109,11 @@ > */ > #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) > #define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL) > +#ifdef CONFIG_KASAN_GENERIC > #define KASAN_SHADOW_END ((UL(1) << (64 - KASAN_SHADOW_SCALE_SHIFT)) + KASAN_SHADOW_OFFSET) > +#else > +#define KASAN_SHADOW_END KASAN_SHADOW_OFFSET > +#endif > #define _KASAN_SHADOW_START(va) (KASAN_SHADOW_END - (UL(1) << ((va) - KASAN_SHADOW_SCALE_SHIFT))) > #define KASAN_SHADOW_START _KASAN_SHADOW_START(vabits_actual) > #define PAGE_END KASAN_SHADOW_START > diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c > index b65a29440a0c..6836e571555c 100644 > --- a/arch/arm64/mm/kasan_init.c > +++ b/arch/arm64/mm/kasan_init.c > @@ -198,8 +198,11 @@ static bool __init root_level_aligned(u64 addr) > /* The early shadow maps everything to a single page of zeroes */ > asmlinkage void __init kasan_early_init(void) > { > - BUILD_BUG_ON(KASAN_SHADOW_OFFSET != > - KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT))); > + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) > + BUILD_BUG_ON(KASAN_SHADOW_OFFSET != > + KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT))); > + else > + BUILD_BUG_ON(KASAN_SHADOW_OFFSET != KASAN_SHADOW_END); > BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS), SHADOW_ALIGN)); > BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS_MIN), SHADOW_ALIGN)); > BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, SHADOW_ALIGN)); > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > index 00a3bf7c0d8f..03b440658817 100644 > --- a/include/linux/kasan.h > +++ b/include/linux/kasan.h > @@ -58,8 +58,14 @@ int kasan_populate_early_shadow(const void *shadow_start, > #ifndef kasan_mem_to_shadow > static inline void *kasan_mem_to_shadow(const void *addr) > { > - return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT) > - + KASAN_SHADOW_OFFSET; > + void *scaled; > + > + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) > + scaled = (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT); > + else > + scaled = (void *)((long)addr >> KASAN_SHADOW_SCALE_SHIFT); > + > + return KASAN_SHADOW_OFFSET + scaled; > } > #endif > > diff --git a/mm/kasan/report.c b/mm/kasan/report.c > index b48c768acc84..c08097715686 100644 > --- a/mm/kasan/report.c > +++ b/mm/kasan/report.c > @@ -644,15 +644,29 @@ void kasan_report_async(void) > */ > void kasan_non_canonical_hook(unsigned long addr) > { > + unsigned long max_shadow_size = BIT(BITS_PER_LONG - KASAN_SHADOW_SCALE_SHIFT); > unsigned long orig_addr; > const char *bug_type; > > /* > - * All addresses that came as a result of the memory-to-shadow mapping > - * (even for bogus pointers) must be >= KASAN_SHADOW_OFFSET. > + * With the default kasan_mem_to_shadow() algorithm, all addresses > + * returned by the memory-to-shadow mapping (even for bogus pointers) > + * must be within a certain displacement from KASAN_SHADOW_OFFSET. > + * > + * For Generic KASAN, the displacement is unsigned, so > + * KASAN_SHADOW_OFFSET is the smallest possible shadow address. For This part of the comment doesn't seem correct: KASAN_SHADOW_OFFSET is still a weird offset value for Generic KASAN, not the smallest possible shadow address. > + * Software Tag-Based KASAN, the displacement is signed, so > + * KASAN_SHADOW_OFFSET is the center of the range. > */ > - if (addr < KASAN_SHADOW_OFFSET) > - return; > + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { > + if (addr < KASAN_SHADOW_OFFSET || > + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) > + return; > + } else { > + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || > + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) > + return; Hm, I might be wrong, but I think this check does not work. Let's say we have non-canonical address 0x4242424242424242 and number of VA bits is 48. Then: KASAN_SHADOW_OFFSET == 0xffff800000000000 kasan_mem_to_shadow(0x4242424242424242) == 0x0423a42424242424 max_shadow_size == 0x1000000000000000 KASAN_SHADOW_OFFSET - max_shadow_size / 2 == 0xf7ff800000000000 KASAN_SHADOW_OFFSET + max_shadow_size / 2 == 0x07ff800000000000 (overflows) 0x0423a42424242424 is < than 0xf7ff800000000000, so the function will wrongly return. > + } > > orig_addr = (unsigned long)kasan_shadow_to_mem((void *)addr); > Just to double-check: kasan_shadow_to_mem() and addr_has_metadata() don't need any changes, right? > diff --git a/scripts/gdb/linux/mm.py b/scripts/gdb/linux/mm.py > index 7571aebbe650..2e63f3dedd53 100644 > --- a/scripts/gdb/linux/mm.py > +++ b/scripts/gdb/linux/mm.py > @@ -110,12 +110,13 @@ class aarch64_page_ops(): > self.KERNEL_END = gdb.parse_and_eval("_end") > > if constants.LX_CONFIG_KASAN_GENERIC or constants.LX_CONFIG_KASAN_SW_TAGS: > + self.KASAN_SHADOW_OFFSET = constants.LX_CONFIG_KASAN_SHADOW_OFFSET > if constants.LX_CONFIG_KASAN_GENERIC: > self.KASAN_SHADOW_SCALE_SHIFT = 3 > + self.KASAN_SHADOW_END = (1 << (64 - self.KASAN_SHADOW_SCALE_SHIFT)) + self.KASAN_SHADOW_OFFSET > else: > self.KASAN_SHADOW_SCALE_SHIFT = 4 > - self.KASAN_SHADOW_OFFSET = constants.LX_CONFIG_KASAN_SHADOW_OFFSET > - self.KASAN_SHADOW_END = (1 << (64 - self.KASAN_SHADOW_SCALE_SHIFT)) + self.KASAN_SHADOW_OFFSET > + self.KASAN_SHADOW_END = self.KASAN_SHADOW_OFFSET > self.PAGE_END = self.KASAN_SHADOW_END - (1 << (self.vabits_actual - self.KASAN_SHADOW_SCALE_SHIFT)) > else: > self.PAGE_END = self._PAGE_END(self.VA_BITS_MIN) > -- > 2.45.1 > Could you also check that everything works when CONFIG_KASAN_SW_TAGS + CONFIG_KASAN_OUTLINE? I think it should, be makes sense to confirm. Thank you!
On 2024-10-23 at 20:41:57 +0200, Andrey Konovalov wrote: >On Tue, Oct 22, 2024 at 3:59 AM Samuel Holland ><samuel.holland@sifive.com> wrote: ... >> + * Software Tag-Based KASAN, the displacement is signed, so >> + * KASAN_SHADOW_OFFSET is the center of the range. >> */ >> - if (addr < KASAN_SHADOW_OFFSET) >> - return; >> + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { >> + if (addr < KASAN_SHADOW_OFFSET || >> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) >> + return; >> + } else { >> + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || >> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) >> + return; > >Hm, I might be wrong, but I think this check does not work. > >Let's say we have non-canonical address 0x4242424242424242 and number >of VA bits is 48. > >Then: > >KASAN_SHADOW_OFFSET == 0xffff800000000000 >kasan_mem_to_shadow(0x4242424242424242) == 0x0423a42424242424 >max_shadow_size == 0x1000000000000000 >KASAN_SHADOW_OFFSET - max_shadow_size / 2 == 0xf7ff800000000000 >KASAN_SHADOW_OFFSET + max_shadow_size / 2 == 0x07ff800000000000 (overflows) > >0x0423a42424242424 is < than 0xf7ff800000000000, so the function will >wrongly return. As I understand this check aims to figure out if the address landed in shadow space and if it didn't we can return. Can't this above snippet be a simple: if (!addr_in_shadow(addr)) return; ?
On 2025-02-10 at 16:22:41 +0100, Maciej Wieczor-Retman wrote: >On 2024-10-23 at 20:41:57 +0200, Andrey Konovalov wrote: >>On Tue, Oct 22, 2024 at 3:59 AM Samuel Holland >><samuel.holland@sifive.com> wrote: >... >>> + * Software Tag-Based KASAN, the displacement is signed, so >>> + * KASAN_SHADOW_OFFSET is the center of the range. >>> */ >>> - if (addr < KASAN_SHADOW_OFFSET) >>> - return; >>> + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { >>> + if (addr < KASAN_SHADOW_OFFSET || >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) >>> + return; >>> + } else { >>> + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) >>> + return; >> >>Hm, I might be wrong, but I think this check does not work. >> >>Let's say we have non-canonical address 0x4242424242424242 and number >>of VA bits is 48. >> >>Then: >> >>KASAN_SHADOW_OFFSET == 0xffff800000000000 >>kasan_mem_to_shadow(0x4242424242424242) == 0x0423a42424242424 >>max_shadow_size == 0x1000000000000000 >>KASAN_SHADOW_OFFSET - max_shadow_size / 2 == 0xf7ff800000000000 >>KASAN_SHADOW_OFFSET + max_shadow_size / 2 == 0x07ff800000000000 (overflows) >> >>0x0423a42424242424 is < than 0xf7ff800000000000, so the function will >>wrongly return. > >As I understand this check aims to figure out if the address landed in shadow >space and if it didn't we can return. > >Can't this above snippet be a simple: > > if (!addr_in_shadow(addr)) > return; > >? Sorry, I think this wouldn't work. The tag also needs to be reset. Does this perhaps work for this problem? if (!addr_in_shadow(kasan_reset_tag((void *)addr))) return; > >-- >Kind regards >Maciej Wieczór-Retman
On Mon, Feb 10, 2025 at 4:53 PM Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> wrote: > > On 2025-02-10 at 16:22:41 +0100, Maciej Wieczor-Retman wrote: > >On 2024-10-23 at 20:41:57 +0200, Andrey Konovalov wrote: > >>On Tue, Oct 22, 2024 at 3:59 AM Samuel Holland > >><samuel.holland@sifive.com> wrote: > >... > >>> + * Software Tag-Based KASAN, the displacement is signed, so > >>> + * KASAN_SHADOW_OFFSET is the center of the range. > >>> */ > >>> - if (addr < KASAN_SHADOW_OFFSET) > >>> - return; > >>> + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { > >>> + if (addr < KASAN_SHADOW_OFFSET || > >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) > >>> + return; > >>> + } else { > >>> + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || > >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) > >>> + return; > >> > >>Hm, I might be wrong, but I think this check does not work. > >> > >>Let's say we have non-canonical address 0x4242424242424242 and number > >>of VA bits is 48. > >> > >>Then: > >> > >>KASAN_SHADOW_OFFSET == 0xffff800000000000 > >>kasan_mem_to_shadow(0x4242424242424242) == 0x0423a42424242424 > >>max_shadow_size == 0x1000000000000000 > >>KASAN_SHADOW_OFFSET - max_shadow_size / 2 == 0xf7ff800000000000 > >>KASAN_SHADOW_OFFSET + max_shadow_size / 2 == 0x07ff800000000000 (overflows) > >> > >>0x0423a42424242424 is < than 0xf7ff800000000000, so the function will > >>wrongly return. > > > >As I understand this check aims to figure out if the address landed in shadow > >space and if it didn't we can return. > > > >Can't this above snippet be a simple: > > > > if (!addr_in_shadow(addr)) > > return; > > > >? > > Sorry, I think this wouldn't work. The tag also needs to be reset. Does this > perhaps work for this problem? > > if (!addr_in_shadow(kasan_reset_tag((void *)addr))) > return; This wouldn't work as well. addr_in_shadow() checks whether an address belongs to the proper shadow memory area. That area is the result of the memory-to-shadow mapping applied to the range of proper kernel addresses. However, what we want to check in this function is whether the given address can be the result of the memory-to-shadow mapping for some memory address, including userspace addresses, non-canonical addresses, etc. So essentially we need to check whether the given address belongs to the area that is the result of the memory-to-shadow mapping applied to the whole address space, not only to proper kernel addresses.
On 2025-02-10 at 23:57:10 +0100, Andrey Konovalov wrote: >On Mon, Feb 10, 2025 at 4:53 PM Maciej Wieczor-Retman ><maciej.wieczor-retman@intel.com> wrote: >> >> On 2025-02-10 at 16:22:41 +0100, Maciej Wieczor-Retman wrote: >> >On 2024-10-23 at 20:41:57 +0200, Andrey Konovalov wrote: >> >>On Tue, Oct 22, 2024 at 3:59 AM Samuel Holland >> >><samuel.holland@sifive.com> wrote: >> >... >> >>> + * Software Tag-Based KASAN, the displacement is signed, so >> >>> + * KASAN_SHADOW_OFFSET is the center of the range. >> >>> */ >> >>> - if (addr < KASAN_SHADOW_OFFSET) >> >>> - return; >> >>> + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { >> >>> + if (addr < KASAN_SHADOW_OFFSET || >> >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) >> >>> + return; >> >>> + } else { >> >>> + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || >> >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) >> >>> + return; >> >> >> >>Hm, I might be wrong, but I think this check does not work. >> >> >> >>Let's say we have non-canonical address 0x4242424242424242 and number >> >>of VA bits is 48. >> >> >> >>Then: >> >> >> >>KASAN_SHADOW_OFFSET == 0xffff800000000000 >> >>kasan_mem_to_shadow(0x4242424242424242) == 0x0423a42424242424 >> >>max_shadow_size == 0x1000000000000000 >> >>KASAN_SHADOW_OFFSET - max_shadow_size / 2 == 0xf7ff800000000000 >> >>KASAN_SHADOW_OFFSET + max_shadow_size / 2 == 0x07ff800000000000 (overflows) >> >> >> >>0x0423a42424242424 is < than 0xf7ff800000000000, so the function will >> >>wrongly return. >> > >> >As I understand this check aims to figure out if the address landed in shadow >> >space and if it didn't we can return. >> > >> >Can't this above snippet be a simple: >> > >> > if (!addr_in_shadow(addr)) >> > return; >> > >> >? >> >> Sorry, I think this wouldn't work. The tag also needs to be reset. Does this >> perhaps work for this problem? >> >> if (!addr_in_shadow(kasan_reset_tag((void *)addr))) >> return; > >This wouldn't work as well. > >addr_in_shadow() checks whether an address belongs to the proper >shadow memory area. That area is the result of the memory-to-shadow >mapping applied to the range of proper kernel addresses. > >However, what we want to check in this function is whether the given >address can be the result of the memory-to-shadow mapping for some >memory address, including userspace addresses, non-canonical >addresses, etc. So essentially we need to check whether the given >address belongs to the area that is the result of the memory-to-shadow >mapping applied to the whole address space, not only to proper kernel >addresses.k Ah, okay, I get it. Would the old version if (addr < KASAN_SHADOW_OFFSET) return; work if the *addr* had kasan_reset_tag() around it? That would sort of re-unsign the address only for the purpose of the if(). Also I was thinking about it because x86 even with address masking enabled keeps bit 63 set, so all kernel addresses will be negative in the signed kasan_mem_to_shadow(). That's great for simplifying the KASAN_SHADOW_OFFSET but it differs from the TBI and risc-v ideas where half of addresses are negative, hald positive. So the temporary re-unsigning could maybe make it simpler for x86 and avoid adding separate cases or alternative kasan_non_canonical_hook() implementation.
On 2025-02-11 at 09:58:22 +0100, Maciej Wieczor-Retman wrote: >On 2025-02-10 at 23:57:10 +0100, Andrey Konovalov wrote: >>On Mon, Feb 10, 2025 at 4:53 PM Maciej Wieczor-Retman >><maciej.wieczor-retman@intel.com> wrote: >>> >>> On 2025-02-10 at 16:22:41 +0100, Maciej Wieczor-Retman wrote: >>> >On 2024-10-23 at 20:41:57 +0200, Andrey Konovalov wrote: >>> >>On Tue, Oct 22, 2024 at 3:59 AM Samuel Holland >>> >><samuel.holland@sifive.com> wrote: >>> >... >>> >>> + * Software Tag-Based KASAN, the displacement is signed, so >>> >>> + * KASAN_SHADOW_OFFSET is the center of the range. >>> >>> */ >>> >>> - if (addr < KASAN_SHADOW_OFFSET) >>> >>> - return; >>> >>> + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { >>> >>> + if (addr < KASAN_SHADOW_OFFSET || >>> >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) >>> >>> + return; >>> >>> + } else { >>> >>> + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || >>> >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) >>> >>> + return; >>> >> >>> >>Hm, I might be wrong, but I think this check does not work. >>> >> >>> >>Let's say we have non-canonical address 0x4242424242424242 and number >>> >>of VA bits is 48. >>> >> >>> >>Then: >>> >> >>> >>KASAN_SHADOW_OFFSET == 0xffff800000000000 >>> >>kasan_mem_to_shadow(0x4242424242424242) == 0x0423a42424242424 >>> >>max_shadow_size == 0x1000000000000000 >>> >>KASAN_SHADOW_OFFSET - max_shadow_size / 2 == 0xf7ff800000000000 >>> >>KASAN_SHADOW_OFFSET + max_shadow_size / 2 == 0x07ff800000000000 (overflows) >>> >> >>> >>0x0423a42424242424 is < than 0xf7ff800000000000, so the function will >>> >>wrongly return. >>> > >>> >As I understand this check aims to figure out if the address landed in shadow >>> >space and if it didn't we can return. >>> > >>> >Can't this above snippet be a simple: >>> > >>> > if (!addr_in_shadow(addr)) >>> > return; >>> > >>> >? >>> >>> Sorry, I think this wouldn't work. The tag also needs to be reset. Does this >>> perhaps work for this problem? >>> >>> if (!addr_in_shadow(kasan_reset_tag((void *)addr))) >>> return; >> >>This wouldn't work as well. >> >>addr_in_shadow() checks whether an address belongs to the proper >>shadow memory area. That area is the result of the memory-to-shadow >>mapping applied to the range of proper kernel addresses. >> >>However, what we want to check in this function is whether the given >>address can be the result of the memory-to-shadow mapping for some >>memory address, including userspace addresses, non-canonical >>addresses, etc. So essentially we need to check whether the given >>address belongs to the area that is the result of the memory-to-shadow >>mapping applied to the whole address space, not only to proper kernel >>addresses.k > >Ah, okay, I get it. Would the old version > > if (addr < KASAN_SHADOW_OFFSET) > return; > >work if the *addr* had kasan_reset_tag() around it? That would sort of re-unsign >the address only for the purpose of the if(). > >Also I was thinking about it because x86 even with address masking enabled keeps >bit 63 set, so all kernel addresses will be negative in the signed >kasan_mem_to_shadow(). That's great for simplifying the KASAN_SHADOW_OFFSET but >it differs from the TBI and risc-v ideas where half of addresses are negative, >hald positive. So the temporary re-unsigning could maybe make it simpler for x86 >and avoid adding separate cases or alternative kasan_non_canonical_hook() >implementation. Oh, nevermind, I see that this is more complicated than that. Sorry for the spam, I'll do some better calculations what is mapped where when doing kasan_mem_to_shadow() and maybe then I'll figure this out.
On 2025-02-10 at 23:57:10 +0100, Andrey Konovalov wrote: >On Mon, Feb 10, 2025 at 4:53 PM Maciej Wieczor-Retman ><maciej.wieczor-retman@intel.com> wrote: >> >> On 2025-02-10 at 16:22:41 +0100, Maciej Wieczor-Retman wrote: >> >On 2024-10-23 at 20:41:57 +0200, Andrey Konovalov wrote: >> >>On Tue, Oct 22, 2024 at 3:59 AM Samuel Holland >> >><samuel.holland@sifive.com> wrote: >> >... >> >>> + * Software Tag-Based KASAN, the displacement is signed, so >> >>> + * KASAN_SHADOW_OFFSET is the center of the range. >> >>> */ >> >>> - if (addr < KASAN_SHADOW_OFFSET) >> >>> - return; >> >>> + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { >> >>> + if (addr < KASAN_SHADOW_OFFSET || >> >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) >> >>> + return; >> >>> + } else { >> >>> + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || >> >>> + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) >> >>> + return; >> >> >> >>Hm, I might be wrong, but I think this check does not work. >> >> >> >>Let's say we have non-canonical address 0x4242424242424242 and number >> >>of VA bits is 48. >> >> >> >>Then: >> >> >> >>KASAN_SHADOW_OFFSET == 0xffff800000000000 >> >>kasan_mem_to_shadow(0x4242424242424242) == 0x0423a42424242424 >> >>max_shadow_size == 0x1000000000000000 >> >>KASAN_SHADOW_OFFSET - max_shadow_size / 2 == 0xf7ff800000000000 >> >>KASAN_SHADOW_OFFSET + max_shadow_size / 2 == 0x07ff800000000000 (overflows) >> >> >> >>0x0423a42424242424 is < than 0xf7ff800000000000, so the function will >> >>wrongly return. >> > >> >As I understand this check aims to figure out if the address landed in shadow >> >space and if it didn't we can return. >> > >> >Can't this above snippet be a simple: >> > >> > if (!addr_in_shadow(addr)) >> > return; >> > >> >? >> >> Sorry, I think this wouldn't work. The tag also needs to be reset. Does this >> perhaps work for this problem? >> >> if (!addr_in_shadow(kasan_reset_tag((void *)addr))) >> return; > >This wouldn't work as well. > >addr_in_shadow() checks whether an address belongs to the proper >shadow memory area. That area is the result of the memory-to-shadow >mapping applied to the range of proper kernel addresses. > >However, what we want to check in this function is whether the given >address can be the result of the memory-to-shadow mapping for some >memory address, including userspace addresses, non-canonical >addresses, etc. So essentially we need to check whether the given >address belongs to the area that is the result of the memory-to-shadow >mapping applied to the whole address space, not only to proper kernel >addresses. I did some experiments with multiple addresses passed through kasan_mem_to_shadow(). And it seems like we can get almost any address out when we consider any random bogus pointers. I used the KASAN_SHADOW_OFFSET from your example above. Userspace addresses seem to map to the range [KASAN_SHADOW_OFFSET - 0xffff8fffffffffff]. Then going through non-canonical addresses until 0x0007ffffffffffff we reach the end of kernel LA and we loop around. Then the addresses seem to go from 0 until we again start reaching the kernel space and then it maps into the proper shadow memory. It gave me the same results when using the previous version of kasan_mem_to_shadow() so I'm wondering whether I'm doing this experiment incorrectly or if there aren't any addresses we can rule out here?
On Tue, Feb 11, 2025 at 7:07 PM Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> wrote: > > I did some experiments with multiple addresses passed through > kasan_mem_to_shadow(). And it seems like we can get almost any address out when > we consider any random bogus pointers. > > I used the KASAN_SHADOW_OFFSET from your example above. Userspace addresses seem > to map to the range [KASAN_SHADOW_OFFSET - 0xffff8fffffffffff]. Then going > through non-canonical addresses until 0x0007ffffffffffff we reach the end of > kernel LA and we loop around. Then the addresses seem to go from 0 until we > again start reaching the kernel space and then it maps into the proper shadow > memory. > > It gave me the same results when using the previous version of > kasan_mem_to_shadow() so I'm wondering whether I'm doing this experiment > incorrectly or if there aren't any addresses we can rule out here? By the definition of the shadow mapping, if we apply that mapping to the whole 64-bit address space, the result will only contain 1/8th (1/16th for SW/HW_TAGS) of that space. For example, with the current upstream value of KASAN_SHADOW_OFFSET on x86 and arm64, the value of the top 3 bits (4 for SW/HW_TAGS) of any shadow address are always the same: KASAN_SHADOW_OFFSET's value is such that the shadow address calculation never overflows. Addresses that have a different value for those top 3 bits are the once we can rule out. The KASAN_SHADOW_OFFSET value from my example does rely on the overflow (arguably, this makes things more confusing [1]). But still, the possible values of shadow addresses should only cover 1/16th of the address space. So whether the address belongs to that 1/8th (1/16th) of the address space is what we want to check in kasan_non_canonical_hook(). The current upstream version of kasan_non_canonical_hook() actually does a simplified check by only checking for the lower bound (e.g. for x86, there's also an upper bound: KASAN_SHADOW_OFFSET + (0xffffffffffffffff >> 3) == 0xfffffbffffffffff), so we could improve it. [1] https://bugzilla.kernel.org/show_bug.cgi?id=218043
On Thu, Feb 13, 2025 at 2:21 AM Andrey Konovalov <andreyknvl@gmail.com> wrote: > > On Tue, Feb 11, 2025 at 7:07 PM Maciej Wieczor-Retman > <maciej.wieczor-retman@intel.com> wrote: > > > > I did some experiments with multiple addresses passed through > > kasan_mem_to_shadow(). And it seems like we can get almost any address out when > > we consider any random bogus pointers. > > > > I used the KASAN_SHADOW_OFFSET from your example above. Userspace addresses seem > > to map to the range [KASAN_SHADOW_OFFSET - 0xffff8fffffffffff]. Then going > > through non-canonical addresses until 0x0007ffffffffffff we reach the end of > > kernel LA and we loop around. Then the addresses seem to go from 0 until we > > again start reaching the kernel space and then it maps into the proper shadow > > memory. > > > > It gave me the same results when using the previous version of > > kasan_mem_to_shadow() so I'm wondering whether I'm doing this experiment > > incorrectly or if there aren't any addresses we can rule out here? > > By the definition of the shadow mapping, if we apply that mapping to > the whole 64-bit address space, the result will only contain 1/8th > (1/16th for SW/HW_TAGS) of that space. > > For example, with the current upstream value of KASAN_SHADOW_OFFSET on > x86 and arm64, the value of the top 3 bits (4 for SW/HW_TAGS) of any > shadow address are always the same: KASAN_SHADOW_OFFSET's value is > such that the shadow address calculation never overflows. Addresses > that have a different value for those top 3 bits are the once we can > rule out. Eh, scratch that, the 3rd bit from the top changes, as KASAN_SHADOW_OFFSET is not a that-well-aligned value, the overall size of the mapping holds. > The KASAN_SHADOW_OFFSET value from my example does rely on the > overflow (arguably, this makes things more confusing [1]). But still, > the possible values of shadow addresses should only cover 1/16th of > the address space. > > So whether the address belongs to that 1/8th (1/16th) of the address > space is what we want to check in kasan_non_canonical_hook(). > > The current upstream version of kasan_non_canonical_hook() actually > does a simplified check by only checking for the lower bound (e.g. for > x86, there's also an upper bound: KASAN_SHADOW_OFFSET + > (0xffffffffffffffff >> 3) == 0xfffffbffffffffff), so we could improve > it. > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=218043
On 2025-02-13 at 02:28:08 +0100, Andrey Konovalov wrote: >On Thu, Feb 13, 2025 at 2:21 AM Andrey Konovalov <andreyknvl@gmail.com> wrote: >> >> On Tue, Feb 11, 2025 at 7:07 PM Maciej Wieczor-Retman >> <maciej.wieczor-retman@intel.com> wrote: >> > >> > I did some experiments with multiple addresses passed through >> > kasan_mem_to_shadow(). And it seems like we can get almost any address out when >> > we consider any random bogus pointers. >> > >> > I used the KASAN_SHADOW_OFFSET from your example above. Userspace addresses seem >> > to map to the range [KASAN_SHADOW_OFFSET - 0xffff8fffffffffff]. Then going >> > through non-canonical addresses until 0x0007ffffffffffff we reach the end of >> > kernel LA and we loop around. Then the addresses seem to go from 0 until we >> > again start reaching the kernel space and then it maps into the proper shadow >> > memory. >> > >> > It gave me the same results when using the previous version of >> > kasan_mem_to_shadow() so I'm wondering whether I'm doing this experiment >> > incorrectly or if there aren't any addresses we can rule out here? >> >> By the definition of the shadow mapping, if we apply that mapping to >> the whole 64-bit address space, the result will only contain 1/8th >> (1/16th for SW/HW_TAGS) of that space. >> >> For example, with the current upstream value of KASAN_SHADOW_OFFSET on >> x86 and arm64, the value of the top 3 bits (4 for SW/HW_TAGS) of any >> shadow address are always the same: KASAN_SHADOW_OFFSET's value is >> such that the shadow address calculation never overflows. Addresses >> that have a different value for those top 3 bits are the once we can >> rule out. > >Eh, scratch that, the 3rd bit from the top changes, as >KASAN_SHADOW_OFFSET is not a that-well-aligned value, the overall size >of the mapping holds. > >> The KASAN_SHADOW_OFFSET value from my example does rely on the >> overflow (arguably, this makes things more confusing [1]). But still, >> the possible values of shadow addresses should only cover 1/16th of >> the address space. >> >> So whether the address belongs to that 1/8th (1/16th) of the address >> space is what we want to check in kasan_non_canonical_hook(). >> Right, I somehow forgot that obviously the whole LA has to map to 1/16th of the address space and it shold stay contiguous. After rethinking how the mapping worked before and will work after making stuff signed I thought this patch could make use of the overflow? From what I noticed, all the Kconfig values for KASAN_SHADOW_OFFSET should make it so there will be overflow when inputing more and more positive addresses. So maybe we should first find what the most negative and most positive (signed) addresses map to in shadow memory address space. And then when looking for invalid values that aren't the product of kasan_mem_to_shadow() we should check if (addr > kasan_mem_to_shadow(biggest_positive_address) && addr < kasan_mem_to_shadow(smallest_negative_address)) return; Is this correct? I think this works with TBI because valid kernel addresses depending on the top bit value will go both up and down from the KASAN_SHADOW_OFFSET. And the same will work for x86 where the (negative) kernel addresses are mapped down of KASAN_SHADOW_OFFSET and positive user addresses are mapped above and also overflow (in tag-based mode). >> The current upstream version of kasan_non_canonical_hook() actually >> does a simplified check by only checking for the lower bound (e.g. for >> x86, there's also an upper bound: KASAN_SHADOW_OFFSET + >> (0xffffffffffffffff >> 3) == 0xfffffbffffffffff), so we could improve >> it. Right, Samuel's check for generic KASAN seems to cover that case. >> >> [1] https://bugzilla.kernel.org/show_bug.cgi?id=218043 > >_______________________________________________ >linux-riscv mailing list >linux-riscv@lists.infradead.org >http://lists.infradead.org/mailman/listinfo/linux-riscv
On 2025-02-13 at 17:20:22 +0100, Maciej Wieczor-Retman wrote: >On 2025-02-13 at 02:28:08 +0100, Andrey Konovalov wrote: >>On Thu, Feb 13, 2025 at 2:21 AM Andrey Konovalov <andreyknvl@gmail.com> wrote: >>> >>> On Tue, Feb 11, 2025 at 7:07 PM Maciej Wieczor-Retman >>> <maciej.wieczor-retman@intel.com> wrote: >>> > >>> > I did some experiments with multiple addresses passed through >>> > kasan_mem_to_shadow(). And it seems like we can get almost any address out when >>> > we consider any random bogus pointers. >>> > >>> > I used the KASAN_SHADOW_OFFSET from your example above. Userspace addresses seem >>> > to map to the range [KASAN_SHADOW_OFFSET - 0xffff8fffffffffff]. Then going >>> > through non-canonical addresses until 0x0007ffffffffffff we reach the end of >>> > kernel LA and we loop around. Then the addresses seem to go from 0 until we >>> > again start reaching the kernel space and then it maps into the proper shadow >>> > memory. >>> > >>> > It gave me the same results when using the previous version of >>> > kasan_mem_to_shadow() so I'm wondering whether I'm doing this experiment >>> > incorrectly or if there aren't any addresses we can rule out here? >>> >>> By the definition of the shadow mapping, if we apply that mapping to >>> the whole 64-bit address space, the result will only contain 1/8th >>> (1/16th for SW/HW_TAGS) of that space. >>> >>> For example, with the current upstream value of KASAN_SHADOW_OFFSET on >>> x86 and arm64, the value of the top 3 bits (4 for SW/HW_TAGS) of any >>> shadow address are always the same: KASAN_SHADOW_OFFSET's value is >>> such that the shadow address calculation never overflows. Addresses >>> that have a different value for those top 3 bits are the once we can >>> rule out. >> >>Eh, scratch that, the 3rd bit from the top changes, as >>KASAN_SHADOW_OFFSET is not a that-well-aligned value, the overall size >>of the mapping holds. >> >>> The KASAN_SHADOW_OFFSET value from my example does rely on the >>> overflow (arguably, this makes things more confusing [1]). But still, >>> the possible values of shadow addresses should only cover 1/16th of >>> the address space. >>> >>> So whether the address belongs to that 1/8th (1/16th) of the address >>> space is what we want to check in kasan_non_canonical_hook(). >>> > >Right, I somehow forgot that obviously the whole LA has to map to 1/16th of the >address space and it shold stay contiguous. > >After rethinking how the mapping worked before and will work after making stuff >signed I thought this patch could make use of the overflow? > >From what I noticed, all the Kconfig values for KASAN_SHADOW_OFFSET should make >it so there will be overflow when inputing more and more positive addresses. > >So maybe we should first find what the most negative and most positive (signed) >addresses map to in shadow memory address space. And then when looking for >invalid values that aren't the product of kasan_mem_to_shadow() we should check > > if (addr > kasan_mem_to_shadow(biggest_positive_address) && > addr < kasan_mem_to_shadow(smallest_negative_address)) > return; > >Is this correct? I suppose the original code in the patch does the same thing when you change the || into &&: if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 && addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) return; kasan_mem_to_shadow(0x7FFFFFFFFFFFFFFF) -> 0x07ff7fffffffffff kasan_mem_to_shadow(0x8000000000000000) -> 0xf7ff800000000000 Also after thinking about this overflow and what maps where I rechecked the kasan_shadow_to_mem() and addr_has_metadata() and they seem to return the values I'd expect without making any changes there. Just mentioning this because I recall you asked about it at the start of this thread.
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index fd9df6dcc593..6a326908c941 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -418,11 +418,11 @@ config KASAN_SHADOW_OFFSET default 0xdffffe0000000000 if ARM64_VA_BITS_42 && !KASAN_SW_TAGS default 0xdfffffc000000000 if ARM64_VA_BITS_39 && !KASAN_SW_TAGS default 0xdffffff800000000 if ARM64_VA_BITS_36 && !KASAN_SW_TAGS - default 0xefff800000000000 if (ARM64_VA_BITS_48 || (ARM64_VA_BITS_52 && !ARM64_16K_PAGES)) && KASAN_SW_TAGS - default 0xefffc00000000000 if (ARM64_VA_BITS_47 || ARM64_VA_BITS_52) && ARM64_16K_PAGES && KASAN_SW_TAGS - default 0xeffffe0000000000 if ARM64_VA_BITS_42 && KASAN_SW_TAGS - default 0xefffffc000000000 if ARM64_VA_BITS_39 && KASAN_SW_TAGS - default 0xeffffff800000000 if ARM64_VA_BITS_36 && KASAN_SW_TAGS + default 0xffff800000000000 if (ARM64_VA_BITS_48 || (ARM64_VA_BITS_52 && !ARM64_16K_PAGES)) && KASAN_SW_TAGS + default 0xffffc00000000000 if (ARM64_VA_BITS_47 || ARM64_VA_BITS_52) && ARM64_16K_PAGES && KASAN_SW_TAGS + default 0xfffffe0000000000 if ARM64_VA_BITS_42 && KASAN_SW_TAGS + default 0xffffffc000000000 if ARM64_VA_BITS_39 && KASAN_SW_TAGS + default 0xfffffff800000000 if ARM64_VA_BITS_36 && KASAN_SW_TAGS default 0xffffffffffffffff config UNWIND_TABLES diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index 0480c61dbb4f..a93fc9dc16f3 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -80,7 +80,8 @@ * where KASAN_SHADOW_SCALE_SHIFT is the order of the number of bits that map * to a single shadow byte and KASAN_SHADOW_OFFSET is a constant that offsets * the mapping. Note that KASAN_SHADOW_OFFSET does not point to the start of - * the shadow memory region. + * the shadow memory region, since not all possible addresses have shadow + * memory allocated for them. * * Based on this mapping, we define two constants: * @@ -89,7 +90,15 @@ * * KASAN_SHADOW_END is defined first as the shadow address that corresponds to * the upper bound of possible virtual kernel memory addresses UL(1) << 64 - * according to the mapping formula. + * according to the mapping formula. For Generic KASAN, the address in the + * mapping formula is treated as unsigned (part of the compiler's ABI), so the + * end of the shadow memory region is at a large positive offset from + * KASAN_SHADOW_OFFSET. For Software Tag-Based KASAN, the address in the + * formula is treated as signed. Since all kernel addresses are negative, they + * map to shadow memory below KASAN_SHADOW_OFFSET, making KASAN_SHADOW_OFFSET + * itself the end of the shadow memory region. (User pointers are positive and + * would map to shadow memory above KASAN_SHADOW_OFFSET, but shadow memory is + * not allocated for them.) * * KASAN_SHADOW_START is defined second based on KASAN_SHADOW_END. The shadow * memory start must map to the lowest possible kernel virtual memory address @@ -100,7 +109,11 @@ */ #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) #define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL) +#ifdef CONFIG_KASAN_GENERIC #define KASAN_SHADOW_END ((UL(1) << (64 - KASAN_SHADOW_SCALE_SHIFT)) + KASAN_SHADOW_OFFSET) +#else +#define KASAN_SHADOW_END KASAN_SHADOW_OFFSET +#endif #define _KASAN_SHADOW_START(va) (KASAN_SHADOW_END - (UL(1) << ((va) - KASAN_SHADOW_SCALE_SHIFT))) #define KASAN_SHADOW_START _KASAN_SHADOW_START(vabits_actual) #define PAGE_END KASAN_SHADOW_START diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c index b65a29440a0c..6836e571555c 100644 --- a/arch/arm64/mm/kasan_init.c +++ b/arch/arm64/mm/kasan_init.c @@ -198,8 +198,11 @@ static bool __init root_level_aligned(u64 addr) /* The early shadow maps everything to a single page of zeroes */ asmlinkage void __init kasan_early_init(void) { - BUILD_BUG_ON(KASAN_SHADOW_OFFSET != - KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT))); + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) + BUILD_BUG_ON(KASAN_SHADOW_OFFSET != + KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT))); + else + BUILD_BUG_ON(KASAN_SHADOW_OFFSET != KASAN_SHADOW_END); BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS), SHADOW_ALIGN)); BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS_MIN), SHADOW_ALIGN)); BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, SHADOW_ALIGN)); diff --git a/include/linux/kasan.h b/include/linux/kasan.h index 00a3bf7c0d8f..03b440658817 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -58,8 +58,14 @@ int kasan_populate_early_shadow(const void *shadow_start, #ifndef kasan_mem_to_shadow static inline void *kasan_mem_to_shadow(const void *addr) { - return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT) - + KASAN_SHADOW_OFFSET; + void *scaled; + + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) + scaled = (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT); + else + scaled = (void *)((long)addr >> KASAN_SHADOW_SCALE_SHIFT); + + return KASAN_SHADOW_OFFSET + scaled; } #endif diff --git a/mm/kasan/report.c b/mm/kasan/report.c index b48c768acc84..c08097715686 100644 --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -644,15 +644,29 @@ void kasan_report_async(void) */ void kasan_non_canonical_hook(unsigned long addr) { + unsigned long max_shadow_size = BIT(BITS_PER_LONG - KASAN_SHADOW_SCALE_SHIFT); unsigned long orig_addr; const char *bug_type; /* - * All addresses that came as a result of the memory-to-shadow mapping - * (even for bogus pointers) must be >= KASAN_SHADOW_OFFSET. + * With the default kasan_mem_to_shadow() algorithm, all addresses + * returned by the memory-to-shadow mapping (even for bogus pointers) + * must be within a certain displacement from KASAN_SHADOW_OFFSET. + * + * For Generic KASAN, the displacement is unsigned, so + * KASAN_SHADOW_OFFSET is the smallest possible shadow address. For + * Software Tag-Based KASAN, the displacement is signed, so + * KASAN_SHADOW_OFFSET is the center of the range. */ - if (addr < KASAN_SHADOW_OFFSET) - return; + if (IS_ENABLED(CONFIG_KASAN_GENERIC)) { + if (addr < KASAN_SHADOW_OFFSET || + addr >= KASAN_SHADOW_OFFSET + max_shadow_size) + return; + } else { + if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 || + addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2) + return; + } orig_addr = (unsigned long)kasan_shadow_to_mem((void *)addr); diff --git a/scripts/gdb/linux/mm.py b/scripts/gdb/linux/mm.py index 7571aebbe650..2e63f3dedd53 100644 --- a/scripts/gdb/linux/mm.py +++ b/scripts/gdb/linux/mm.py @@ -110,12 +110,13 @@ class aarch64_page_ops(): self.KERNEL_END = gdb.parse_and_eval("_end") if constants.LX_CONFIG_KASAN_GENERIC or constants.LX_CONFIG_KASAN_SW_TAGS: + self.KASAN_SHADOW_OFFSET = constants.LX_CONFIG_KASAN_SHADOW_OFFSET if constants.LX_CONFIG_KASAN_GENERIC: self.KASAN_SHADOW_SCALE_SHIFT = 3 + self.KASAN_SHADOW_END = (1 << (64 - self.KASAN_SHADOW_SCALE_SHIFT)) + self.KASAN_SHADOW_OFFSET else: self.KASAN_SHADOW_SCALE_SHIFT = 4 - self.KASAN_SHADOW_OFFSET = constants.LX_CONFIG_KASAN_SHADOW_OFFSET - self.KASAN_SHADOW_END = (1 << (64 - self.KASAN_SHADOW_SCALE_SHIFT)) + self.KASAN_SHADOW_OFFSET + self.KASAN_SHADOW_END = self.KASAN_SHADOW_OFFSET self.PAGE_END = self.KASAN_SHADOW_END - (1 << (self.vabits_actual - self.KASAN_SHADOW_SCALE_SHIFT)) else: self.PAGE_END = self._PAGE_END(self.VA_BITS_MIN)
Currently, kasan_mem_to_shadow() uses a logical right shift, which turns canonical kernel addresses into non-canonical addresses by clearing the high KASAN_SHADOW_SCALE_SHIFT bits. The value of KASAN_SHADOW_OFFSET is then chosen so that the addition results in a canonical address for the shadow memory. For KASAN_GENERIC, this shift/add combination is ABI with the compiler, because KASAN_SHADOW_OFFSET is used in compiler-generated inline tag checks[1], which must only attempt to dereference canonical addresses. However, for KASAN_SW_TAGS we have some freedom to change the algorithm without breaking the ABI. Because TBI is enabled for kernel addresses, the top bits of shadow memory addresses computed during tag checks are irrelevant, and so likewise are the top bits of KASAN_SHADOW_OFFSET. This is demonstrated by the fact that LLVM uses a logical right shift in the tag check fast path[2] but a sbfx (signed bitfield extract) instruction in the slow path[3] without causing any issues. Using an arithmetic shift in kasan_mem_to_shadow() provides a number of benefits: 1) The memory layout is easier to understand. KASAN_SHADOW_OFFSET becomes a canonical memory address, and the shifted pointer becomes a negative offset, so KASAN_SHADOW_OFFSET == KASAN_SHADOW_END regardless of the shift amount or the size of the virtual address space. 2) KASAN_SHADOW_OFFSET becomes a simpler constant, requiring only one instruction to load instead of two. Since it must be loaded in each function with a tag check, this decreases kernel text size by 0.5%. 3) This shift and the sign extension from kasan_reset_tag() can be combined into a single sbfx instruction. When this same algorithm change is applied to the compiler, it removes an instruction from each inline tag check, further reducing kernel text size by an additional 4.6%. These benefits extend to other architectures as well. On RISC-V, where the baseline ISA does not shifted addition or have an equivalent to the sbfx instruction, loading KASAN_SHADOW_OFFSET is reduced from 3 to 2 instructions, and kasan_mem_to_shadow(kasan_reset_tag(addr)) similarly combines two consecutive right shifts. Link: https://github.com/llvm/llvm-project/blob/llvmorg-20-init/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp#L1316 [1] Link: https://github.com/llvm/llvm-project/blob/llvmorg-20-init/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp#L895 [2] Link: https://github.com/llvm/llvm-project/blob/llvmorg-20-init/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp#L669 [3] Signed-off-by: Samuel Holland <samuel.holland@sifive.com> --- Changes in v2: - Improve the explanation for how KASAN_SHADOW_END is derived - Update the range check in kasan_non_canonical_hook() arch/arm64/Kconfig | 10 +++++----- arch/arm64/include/asm/memory.h | 17 +++++++++++++++-- arch/arm64/mm/kasan_init.c | 7 +++++-- include/linux/kasan.h | 10 ++++++++-- mm/kasan/report.c | 22 ++++++++++++++++++---- scripts/gdb/linux/mm.py | 5 +++-- 6 files changed, 54 insertions(+), 17 deletions(-)