[v2,13/33] kmsan: Introduce memset_no_sanitize_memory()

Message ID	20231121220155.1217090-14-iii@linux.ibm.com (mailing list archive)
State	Superseded
From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Alexander Gordeev <agordeev@linux.ibm.com>, Alexander Potapenko <glider@google.com>, Andrew Morton <akpm@linux-foundation.org>, Christoph Lameter <cl@linux.com>, David Rientjes <rientjes@google.com>, Heiko Carstens <hca@linux.ibm.com>, Joonsoo Kim <iamjoonsoo.kim@lge.com>, Marco Elver <elver@google.com>, Masami Hiramatsu <mhiramat@kernel.org>, Pekka Enberg <penberg@kernel.org>, Steven Rostedt <rostedt@goodmis.org>, Vasily Gorbik <gor@linux.ibm.com>, Vlastimil Babka <vbabka@suse.cz>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>, Dmitry Vyukov <dvyukov@google.com>, Hyeonggon Yoo <42.hyeyoo@gmail.com>, kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, linux-trace-kernel@vger.kernel.org, Mark Rutland <mark.rutland@arm.com>, Roman Gushchin <roman.gushchin@linux.dev>, Sven Schnelle <svens@linux.ibm.com>, Ilya Leoshkevich <iii@linux.ibm.com>
Subject: [PATCH v2 13/33] kmsan: Introduce memset_no_sanitize_memory()
Date: Tue, 21 Nov 2023 23:01:07 +0100
Message-ID: <20231121220155.1217090-14-iii@linux.ibm.com>
In-Reply-To: <20231121220155.1217090-1-iii@linux.ibm.com>
References: <20231121220155.1217090-1-iii@linux.ibm.com>
Series	kmsan: Enable on s390

Ilya Leoshkevich Nov. 21, 2023, 10:01 p.m. UTC

Add a wrapper for memset() that prevents unpoisoning. This is useful
for filling memory allocator redzones.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 include/linux/kmsan.h | 9 +++++++++
 1 file changed, 9 insertions(+)

Alexander Potapenko Dec. 8, 2023, 1:48 p.m. UTC | #1

On Tue, Nov 21, 2023 at 11:06 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> Add a wrapper for memset() that prevents unpoisoning.

We have __memset() already, won't it work for this case?
On the other hand, I am not sure you want to preserve the redzone in
its previous state (unless it's known to be poisoned).
You might consider explicitly unpoisoning the redzone instead.

...

> +__no_sanitize_memory
> +static inline void *memset_no_sanitize_memory(void *s, int c, size_t n)
> +{
> +       return memset(s, c, n);
> +}

I think depending on the compiler optimizations this might end up
being a call to normal memset, that would still change the shadow
bytes.

Ilya Leoshkevich Dec. 8, 2023, 2:07 p.m. UTC | #2

On Fri, 2023-12-08 at 14:48 +0100, Alexander Potapenko wrote:
> On Tue, Nov 21, 2023 at 11:06 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > Add a wrapper for memset() that prevents unpoisoning.
> 
> We have __memset() already, won't it work for this case?

A problem with __memset() is that, at least for me, it always ends
up being a call. There is a use case where we need to write only 1
byte, so I thought that introducing a call there (when compiling
without KMSAN) would be unacceptable.

> On the other hand, I am not sure you want to preserve the redzone in
> its previous state (unless it's known to be poisoned).

That's exactly the problem with unpoisoning: it removes the distinction
between a new allocation and a UAF.

> You might consider explicitly unpoisoning the redzone instead.

That was my first attempt, but it resulted in test failures due to the
above.

> ...
> 
> > +__no_sanitize_memory
> > +static inline void *memset_no_sanitize_memory(void *s, int c,
> > size_t n)
> > +{
> > +       return memset(s, c, n);
> > +}
> 
> I think depending on the compiler optimizations this might end up
> being a call to normal memset, that would still change the shadow
> bytes.

Interesting, do you have some specific scenario in mind? I vaguely
remember that in the past there were cases when sanitizer annotations
were lost after inlining, but I thought they were sorted out?

And, in any case, if this were to happen, would not it be considered a
compiler bug that needs fixing there, and not in the kernel?

Alexander Potapenko Dec. 8, 2023, 3:25 p.m. UTC | #3

> A problem with __memset() is that, at least for me, it always ends
> up being a call. There is a use case where we need to write only 1
> byte, so I thought that introducing a call there (when compiling
> without KMSAN) would be unacceptable.

Wonder what happens with that use case if we e.g. build with fortify-source.
Calling memset() for a single byte might be indicating the code is not hot.

> > ...
> >
> > > +__no_sanitize_memory
> > > +static inline void *memset_no_sanitize_memory(void *s, int c,
> > > size_t n)
> > > +{
> > > +       return memset(s, c, n);
> > > +}
> >
> > I think depending on the compiler optimizations this might end up
> > being a call to normal memset, that would still change the shadow
> > bytes.
>
> Interesting, do you have some specific scenario in mind? I vaguely
> remember that in the past there were cases when sanitizer annotations
> were lost after inlining, but I thought they were sorted out?

Sanitizer annotations are indeed lost after inlining, and we cannot do
much about that.
They are implemented using function attributes, and if a function
dissolves after inlining, we cannot possibly know which instructions
belonged to it.

Consider the following example (also available at
https://godbolt.org/z/5r7817G8e):

==================================
void *kmalloc(int size);

__attribute__((no_sanitize("kernel-memory")))
__attribute__((always_inline))
static void *memset_nosanitize(void *s, int c, int n) {
  return __builtin_memset(s, c, n);
}

void *do_something_nosanitize(int size) {
  void *ptr = kmalloc(size);
  memset_nosanitize(ptr, 0, size);
  return ptr;
}

void *do_something_sanitize(int size) {
  void *ptr = kmalloc(size);
  __builtin_memset(ptr, 0, size);
  return ptr;
}
==================================

If memset_nosanitize() has __attribute__((always_inline)), the
compiler generates the same LLVM IR calling __msan_memset() for both
do_something_nosanitize() and do_something_sanitize().
If we comment out this attribute, do_something_nosanitize() calls
memset_nosanitize(), which doesn't have the sanitize_memory attribute.

But even now __builtin_memset() is still calling __msan_memset(),
because __attribute__((no_sanitize("kernel-memory"))) somewhat
counterintuitively still preserves some instrumentation (see
include/linux/compiler-clang.h for details).
Replacing __attribute__((no_sanitize("kernel-memory"))) with
__attribute__((disable_sanitizer_instrumentation)) fixes this
situation:

define internal fastcc noundef ptr @memset_nosanitize(void*, int, int)(ptr noundef returned writeonly %s, i32 noundef %n) unnamed_addr #2 {
entry:
  %conv = sext i32 %n to i64
  tail call void @llvm.memset.p0.i64(ptr align 1 %s, i8 0, i64 %conv, i1 false)
  ret ptr %s
}
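
For reference, the variant described above (always_inline dropped, no_sanitize("kernel-memory") swapped for disable_sanitizer_instrumentation) might be written as in the sketch below. This is only an illustration: the NO_INSTRUMENTATION macro and fill_nosanitize() are made-up names, the __has_attribute fallback is an assumption so that the file also builds on compilers without the attribute (where nothing is suppressed), and the helper takes a caller-supplied buffer instead of calling kmalloc().

```c
#include <stddef.h>

/*
 * Assumption: Clang provides disable_sanitizer_instrumentation. On
 * compilers that lack it the macro expands to nothing, so the code
 * still builds but no instrumentation is suppressed.
 */
#if defined(__has_attribute)
# if __has_attribute(disable_sanitizer_instrumentation)
#  define NO_INSTRUMENTATION __attribute__((disable_sanitizer_instrumentation))
# endif
#endif
#ifndef NO_INSTRUMENTATION
# define NO_INSTRUMENTATION
#endif

/*
 * Deliberately not always_inline: the attribute is attached to the
 * function, so it only has an effect while the function still exists.
 */
NO_INSTRUMENTATION
static void *memset_nosanitize(void *s, int c, size_t n)
{
	return __builtin_memset(s, c, n);
}

/* Hypothetical caller, shaped like do_something_nosanitize() above. */
void *fill_nosanitize(void *buf, size_t size)
{
	return memset_nosanitize(buf, 0, size);
}
```

Note that without noinline the optimizer is still free to inline memset_nosanitize() into its caller, so this sketch does not by itself guarantee the result.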

>
> And, in any case, if this were to happen, would not it be considered a
> compiler bug that needs fixing there, and not in the kernel?

As stated above, I think this is more or less working as intended.
If we really want the ability to inline __memset(), we could transform
it into memset() in non-sanitizer builds, but perhaps having a call is
also acceptable?
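
A rough sketch of that idea, not taken from any posted patch: select the out-of-line routine only in sanitizer builds and fall back to plain memset(), which the compiler may inline, everywhere else. The names memset_uninstrumented, raw_memset and SANITIZER_BUILD are hypothetical stand-ins; in the kernel the roles would presumably be played by __memset() and CONFIG_KMSAN.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for an out-of-line, uninstrumented memset
 * (the role __memset() plays in the discussion). Only declared here;
 * it is referenced solely when SANITIZER_BUILD is defined.
 */
void *raw_memset(void *s, int c, size_t n);

#ifdef SANITIZER_BUILD	/* hypothetical; stands in for CONFIG_KMSAN */
/* Sanitizer build: always a real call, so shadow memory is untouched. */
# define memset_uninstrumented(s, c, n) raw_memset((s), (c), (n))
#else
/* Regular build: plain memset(), free to be inlined for small sizes. */
# define memset_uninstrumented(s, c, n) memset((s), (c), (n))
#endif
```

With this shape, a single-byte fill costs a call only in sanitizer builds, which is the trade-off being asked about.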

Ilya Leoshkevich Dec. 13, 2023, 1:31 a.m. UTC | #4

On Fri, 2023-12-08 at 16:25 +0100, Alexander Potapenko wrote:
> > A problem with __memset() is that, at least for me, it always ends
> > up being a call. There is a use case where we need to write only 1
> > byte, so I thought that introducing a call there (when compiling
> > without KMSAN) would be unacceptable.
> 
> Wonder what happens with that use case if we e.g. build with fortify-
> source.
> Calling memset() for a single byte might be indicating the code is
> not hot.

The original code has a simple assignment. Here is the relevant diff:

        if (s->flags & __OBJECT_POISON) {
-               memset(p, POISON_FREE, poison_size - 1);
-               p[poison_size - 1] = POISON_END;
+               memset_no_sanitize_memory(p, POISON_FREE, poison_size - 1);
+               memset_no_sanitize_memory(p + poison_size - 1, POISON_END, 1);
        }
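
To make the single-byte case concrete, the removed lines of the diff can be sketched as a stand-alone userspace function. poison_object() is a hypothetical name and the zero-size guard is an addition for the sketch; POISON_FREE and POISON_END use the values from the kernel's include/linux/poison.h.

```c
#include <stddef.h>
#include <string.h>

/* Values as defined in include/linux/poison.h. */
#define POISON_FREE 0x6b
#define POISON_END  0xa5

/*
 * Hypothetical helper mirroring the removed ("-") lines of the diff:
 * fill all but the last byte with POISON_FREE, then store POISON_END
 * into the last byte with a plain assignment.
 */
static void poison_object(unsigned char *p, size_t poison_size)
{
	if (poison_size == 0)	/* guard added for the sketch only */
		return;
	memset(p, POISON_FREE, poison_size - 1);
	p[poison_size - 1] = POISON_END;	/* the one-byte store in question */
}
```

The final store is the part that turns into a function call if it is routed through an out-of-line memset variant.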

[...]


> As stated above, I think this is more or less working as
> intended.
> If we really want the ability to inline __memset(), we could
> transform
> it into memset() in non-sanitizer builds, but perhaps having a call
> is
> also acceptable?

Thanks for the detailed explanation and analysis. I will post
a version with a __memset() and let the slab maintainers decide if
the additional overhead is acceptable.

Ilya Leoshkevich Dec. 13, 2023, 11:32 a.m. UTC | #5

On Wed, 2023-12-13 at 02:31 +0100, Ilya Leoshkevich wrote:
> On Fri, 2023-12-08 at 16:25 +0100, Alexander Potapenko wrote:
> > > A problem with __memset() is that, at least for me, it always
> > > ends
> > > up being a call. There is a use case where we need to write only
> > > 1
> > > byte, so I thought that introducing a call there (when compiling
> > > without KMSAN) would be unacceptable.

[...]

> > As stated above, I think this is more or less working as
> > intended.
> > If we really want the ability to inline __memset(), we could
> > transform
> > it into memset() in non-sanitizer builds, but perhaps having a call
> > is
> > also acceptable?
> 
> Thanks for the detailed explanation and analysis. I will post
> a version with a __memset() and let the slab maintainers decide if
> the additional overhead is acceptable.

I noticed I had the same problem in the get_user()/put_user() and
check_canary() patches.

The annotation being silently ignored is never what a programmer
intends, so what do you think about adding noinline to
__no_kmsan_checks and __no_sanitize_memory?
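
As a sketch of what such a combination might look like outside the kernel (the macro name NO_SANITIZE_NOINLINE and the helper below are hypothetical, and the fallback for compilers without disable_sanitizer_instrumentation is an assumption):

```c
#include <stddef.h>

/*
 * Pair the "do not instrument" attribute with noinline so the function
 * cannot be merged into an instrumented caller.
 */
#if defined(__has_attribute)
# if __has_attribute(disable_sanitizer_instrumentation)
#  define NO_SANITIZE_NOINLINE \
	__attribute__((disable_sanitizer_instrumentation, noinline))
# endif
#endif
#ifndef NO_SANITIZE_NOINLINE
/* Fallback: keep the function out of line even without the attribute. */
# define NO_SANITIZE_NOINLINE __attribute__((noinline))
#endif

NO_SANITIZE_NOINLINE
static void *memset_kept_out_of_line(void *s, int c, size_t n)
{
	return __builtin_memset(s, c, n);
}
```

The cost is the same as discussed earlier in the thread: every use becomes a call, including in builds without the sanitizer unless the macro is made conditional.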
