Message ID | 20170724133824.27223-4-LiljestrandH@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Jul 24, 2017 at 6:38 AM, Hans Liljestrand <liljestrandh@gmail.com> wrote: > This adds actual implementation for mpxk-wrapper functions. The wrapper > function are used by the instrumentation to update and check pointer bounds > on functions that alter memory, e.g. kmalloc and memcpy. The kmalloc > wrapper function for instance simply executes kmalloc, associates bounds > with the returned pointer, and returns both. Other wrapper functions, such > as for memcpy, also check the bounds of incoming arguments. > > For future work these wrappers could potentially be replaced by direct > instrumentation without the need to incur the cost of calling the wrapper > function. In this scenario every kmalloc would simply be preceded by an > appropriate mkbnd instruction, and memcpy preceded by bndcu+bndcl > instructions. > > The wrappers are added by the MPXK gcc-plugin, and as such work on > preprocessed code. This introduces another problem with our > implementation since macros might actually be used to direct "base > functions" into specific implementations (e.g. memcpy might be a macro > pointing to __memcpy). One solution is covering all possibilities, but > this might introduce unwanted code bloat. I'd be curious to see how (if?) this interacts with CONFIG_FORTIFY. It seems that currently the MPXK checks would be similar to the inter-object checks as they exist (e.g. checking the size of the whole kmalloc allocation), but it wouldn't work on intra-object copies (i.e. bounds checking a field within an object). More directly, I'd be curious to see coverage and performance comparisons between FORTIFY and MPXK. -Kees
On Mon, Jul 24, 2017 at 07:45:21PM -0700, Kees Cook wrote: >On Mon, Jul 24, 2017 at 6:38 AM, Hans Liljestrand ><liljestrandh@gmail.com> wrote: >> This adds actual implementation for mpxk-wrapper functions. The wrapper >> function are used by the instrumentation to update and check pointer bounds >> on functions that alter memory, e.g. kmalloc and memcpy. The kmalloc >> wrapper function for instance simply executes kmalloc, associates bounds >> with the returned pointer, and returns both. Other wrapper functions, such >> as for memcpy, also check the bounds of incoming arguments. >> >> For future work these wrappers could potentially be replaced by direct >> instrumentation without the need to incur the cost of calling the wrapper >> function. In this scenario every kmalloc would simply be preceded by an >> appropriate mkbnd instruction, and memcpy preceded by bndcu+bndcl >> instructions. >> >> The wrappers are added by the MPXK gcc-plugin, and as such work on >> preprocessed code. This introduces another problem with our >> implementation since macros might actually be used to direct "base >> functions" into specific implementations (e.g. memcpy might be a macro >> pointing to __memcpy). One solution is covering all possibilities, but >> this might introduce unwanted code bloat. > >I'd be curious to see how (if?) this interacts with CONFIG_FORTIFY. It >seems that currently the MPXK checks would be similar to the >inter-object checks as they exist (e.g. checking the size of the whole >kmalloc allocation), but it wouldn't work on intra-object copies (i.e. >bounds checking a field within an object). Do you mean the FORTIFY_SOURCE thing (could not find anything directly on CONFIG_FORTIFY). Based on a quick look FORTIFY_SOURCE is localized, and essentially just provides the equivalent os our "wrappers", but I could be wildly off? MPXK on the other hand also checks other pointer dereferences and propagates pointer bounds into non-wrapper functions. The wrappers themselves also do not load any bounds, instead the calling function passes in the pointer bounds via the bound registers (although the caller might, depending on the situation have loaded the bounds based on the kmalloc allocation). Regarding inter-objects thing MPXK is a bit fuzzy, which is one of the weaknesses of our approach in general. The mpxk_load_bounds functions uses the whole kmalloc allocation, but MPX also uses narrowing, which (in some cases) narrows pointer bounds to intra-object pointers (excluding first fields and pointers into arrays, which all retain the bounds of the whole structure). The narrowing is provided by vanilla GCC MPX instrumentation, not specific to our MPXK code. As a guarantee we therefore cannot say more than that we enforce the whole kmalloc area (provided kmalloc was used). > >More directly, I'd be curious to see coverage and performance >comparisons between FORTIFY and MPXK. Based on the above we have seen KASAN as the more direct comparison, or would you disagree? In the any case, we would also like to get some actual comparisons done. Can you recommend any "standard" way of doing measurements, either coverage or performance, for this kind of protections? Thanks, -hans > >-Kees > >-- >Kees Cook >Pixel Security
On Tue, Jul 25, 2017 at 12:52 AM, Hans Liljestrand <liljestrandh@gmail.com> wrote: > Do you mean the FORTIFY_SOURCE thing (could not find anything directly on > CONFIG_FORTIFY). Based on a quick look FORTIFY_SOURCE is localized, and > essentially just provides the equivalent os our "wrappers", but I could be > wildly off? Yup, I mean this: https://git.kernel.org/linus/6974f0c4555e285ab217cee58b6e874f776ff409 > MPXK on the other hand also checks other pointer dereferences and propagates > pointer bounds into non-wrapper functions. The wrappers themselves also do > not load any bounds, instead the calling function passes in the pointer > bounds via the bound registers (although the caller might, depending on the > situation have loaded the bounds based on the kmalloc allocation). What are the code patterns that MPXK protects? It sounds like I'm not quite seeing what MPXK provides beyond the CONFIG_FORTIFY_SOURCE checks (both need to know the bounds already...) > Can you recommend any "standard" way of doing measurements, either coverage > or performance, for this kind of protections? I say, for build coverage (possibly via some compile-time warning), how many, e.g., memcpy()s are protected (i.e. bounds are known). And then similarly, during runtime (possibly via kprobe) how many memcpy()s are protected? For performance, it's trickier, and needs a meaningful workload. I've used hackbench and kernel build times... -Kees
On Tue, Jul 25, 2017 at 11:22:02AM -0700, Kees Cook wrote: >On Tue, Jul 25, 2017 at 12:52 AM, Hans Liljestrand ><liljestrandh@gmail.com> wrote: >> Do you mean the FORTIFY_SOURCE thing (could not find anything directly on >> CONFIG_FORTIFY). Based on a quick look FORTIFY_SOURCE is localized, and >> essentially just provides the equivalent os our "wrappers", but I could be >> wildly off? > >Yup, I mean this: > >https://git.kernel.org/linus/6974f0c4555e285ab217cee58b6e874f776ff409 Okay, thanks! > >> MPXK on the other hand also checks other pointer dereferences and propagates >> pointer bounds into non-wrapper functions. The wrappers themselves also do >> not load any bounds, instead the calling function passes in the pointer >> bounds via the bound registers (although the caller might, depending on the >> situation have loaded the bounds based on the kmalloc allocation). > >What are the code patterns that MPXK protects? It sounds like I'm not >quite seeing what MPXK provides beyond the CONFIG_FORTIFY_SOURCE >checks (both need to know the bounds already...) From what I can tell FORTIFY needs to know the during compile time and is only applied on specific function. This is not the case with MPXK, which works more like KASAN in that it enforces runtime bounds for any pointer access. As a concrete example of the runtime instrumentation: noinline void try_write(char *ptr, int i) { ptr[i] = '\0'; } becomes (where the bnd0 is set by the caller along with function args): 0000000000000000 <try_write>: 0: e8 00 00 00 00 callq 5 <try_write+0x5> 5: 4c 8d 54 24 08 lea 0x8(%rsp),%r10 a: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp e: 48 63 f6 movslq %esi,%rsi 11: 48 01 f7 add %rsi,%rdi 14: 41 ff 72 f8 pushq -0x8(%r10) 18: 55 push %rbp 19: 48 89 e5 mov %rsp,%rbp 1c: 41 52 push %r10 1e: 48 83 ec 08 sub $0x8,%rsp 22: f3 0f 1a 07 bndcl (%rdi),%bnd0 # <- bound checks 26: f2 0f 1a 07 bndcu (%rdi),%bnd0 # <- bound checks 2a: c6 07 00 movb $0x0,(%rdi) 2d: 48 83 c4 08 add $0x8,%rsp 31: 41 5a pop %r10 33: 5d pop %rbp 34: 49 8d 62 f8 lea -0x8(%r10),%rsp 38: f2 c3 bnd retq 3a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) This applies to any MPXK compiled function and pointer dereference, not only wrappers. The caller of course needs to know (at runtime) the bounds, which were set by the instrumentation either compile-time or runtime, potentially in the kmalloc wrapper (based on the size argument and returned via the bnd0 register). The memcpy & friends are wrapped so the checks can be done explicitly before letting the functions do their thing. The wrappers serve, IIUC, a similar purpose to the KASAN's direct modifications to the same functions. KASAN uses shadow memory to keep track of valid memory regions and then checks pointer dereferences against that. MPXK by contrast tracks the bounds along in-memory (either on stack or in .data section) or in MPX hardware bound registers (using regular MPX GCC instrumentation). This is not possible when pointers are stashed away in a shared array for example, so in those cases the bounds are stored elsewhere. Vanilla MPX uses a separate data structure for this, whereas MPXK attempts to determine bounds based on the kmalloc allocation area without explicit bound stores, shadow memory or fat-pointers. The main difference to KASAN/vanilla-MPX is that MPXK does not use any extra memory and we therefore hope MPXK will be more feasible to actually use runtime. In support of that idea MPXK is also designed to be modular, i.e. you can just apply it on specific parts of the code (hence also wrappers instead of KASAN-like direct changes to memcpy, etc implementations). > >> Can you recommend any "standard" way of doing measurements, either coverage >> or performance, for this kind of protections? > >I say, for build coverage (possibly via some compile-time warning), >how many, e.g., memcpy()s are protected (i.e. bounds are known). And >then similarly, during runtime (possibly via kprobe) how many >memcpy()s are protected? Yes, this sounds good. Although it would not cover non-wrapped pointer dereferences. None the less, this should give us an estimate of how much our bound load limitations actually affects MPXK coverage. When considering only these functions I think our worst case should be similar to what CONFIG_FORTIFY already does. Thanks for the suggestion! > >For performance, it's trickier, and needs a meaningful workload. I've >used hackbench and kernel build times... The meaningful workload thing is further complicated because we haven't envisioned this for kernel-wide deployment. But I guess we will have to do something like this to get comparable numbers. I will have to take a look at hackbench. Thanks, -hans > >-Kees > >-- >Kees Cook >Pixel Security
diff --git a/arch/x86/lib/mpxk-wrappers.c b/arch/x86/lib/mpxk-wrappers.c new file mode 100644 index 000000000000..0b0fe235c90f --- /dev/null +++ b/arch/x86/lib/mpxk-wrappers.c @@ -0,0 +1,157 @@ +/* + * arch/x86/lib/mpxk-wrappers.h + * + * Copyright (C) 2017 Aalto University + */ +#include <linux/page-flags.h> +#include <linux/vmalloc.h> +#include <linux/slab.h> +#include <linux/mm.h> +#include <asm/page.h> +#include <asm/pgtable_64.h> + +#define MPXK_BUILTIN_DEF(r, f, ...) extern r f(__VA_ARGS__); +#define MPXK_WRAPPER_DEF(r, f, ...) extern r mpxk_wrapper_##f(__VA_ARGS__); + +#include "../../../../scripts/gcc-plugins/mpxk_builtins.def" + +#undef MPXK_WRAPPER_DEF +#undef MPXK_BUILTIN_DEF + +void *mpxk_wrapper_kmalloc(size_t s, gfp_t f) +{ + void *p = kmalloc(s, f); + + if (p) + return __bnd_set_ptr_bounds(p, s); + return __bnd_null_ptr_bounds(p); +} + +void *mpxk_wrapper_krealloc(void *p, size_t s, gfp_t f) +{ + if (!p) + return mpxk_wrapper_kmalloc(s, f); + + __bnd_chk_ptr_lbounds(p); + p = krealloc(p, s, f); + + if (p) + return __bnd_set_ptr_bounds(p, s); + return __bnd_null_ptr_bounds(p); +} + +void *mpxk_wrapper_memmove(void *d, const void *s, size_t c) +{ + if (c == 0) + return d; + + __bnd_chk_ptr_bounds(d, c); + __bnd_chk_ptr_bounds(s, c); + + return memmove(d, s, c); +} + +void *mpxk_wrapper_memcpy(void *d, const void *s, size_t c) +{ + if (c == 0) + return d; + + __bnd_chk_ptr_bounds(d, c); + __bnd_chk_ptr_bounds(s, c); + + return memcpy(d, s, c); +} + +/* Because the MPXK gcc-plugin works on preprocessed code it cannot properly + * handle macro expanded calls such as potential memcpy -> __memcpy changes. + * This probably needs to be solved in some cleaner way, and probably also + * affects other function besides memcpy. + */ +void *mpxk_wrapper___inline_memcpy(void *d, const void *s, size_t c) +{ + if (c == 0) + return d; + + __bnd_chk_ptr_bounds(d, c); + __bnd_chk_ptr_bounds(s, c); + + return __inline_memcpy(d, s, c); +} + +void *mpxk_wrapper___memcpy(void *d, const void *s, size_t c) +{ + if (c == 0) + return d; + + __bnd_chk_ptr_bounds(d, c); + __bnd_chk_ptr_bounds(s, c); + + return __memcpy(d, s, c); +} + +void *mpxk_wrapper_memset(void *s, int c, size_t l) +{ + if (l > 0) { + __bnd_chk_ptr_bounds(s, l); + memset(s, c, l); + } + return s; +} + +char *mpxk_wrapper_strcat(char *d, const char *s) +{ + size_t ds = mpxk_wrapper_strlen(d); + size_t ss = mpxk_wrapper_strlen(s); + + __bnd_chk_ptr_bounds(d, ds + ss + 1); + __bnd_chk_ptr_bounds(s, ss + 1); + + return strcat(d, s); +} + +char *mpxk_wrapper_strncat(char *d, const char *s, size_t c) +{ + size_t ds = mpxk_wrapper_strlen(d); + size_t ss = mpxk_wrapper_strnlen(s, c); + + __bnd_chk_ptr_bounds(d, ds + ss + 1); + __bnd_chk_ptr_bounds(s, (ss < c ? ss + 1 : ss)); + + return strncat(d, s, c); +} + +char *mpxk_wrapper_strcpy(char *d, const char *s) +{ + size_t ss = mpxk_wrapper_strlen(s); + + __bnd_chk_ptr_bounds(d, ss + 1); + __bnd_chk_ptr_bounds(s, ss + 1); + + return strcpy(d, s); +} + +char *mpxk_wrapper_strncpy(char *d, const char *s, size_t c) +{ + size_t ss = strnlen(s, c); + + __bnd_chk_ptr_bounds(d, c); + __bnd_chk_ptr_bounds(s, (ss < c ? ss + 1 : ss)); + + return strncpy(d, s, c); +} + +size_t mpxk_wrapper_strlen(const char *s) +{ + const size_t l = strlen(s); + + __bnd_chk_ptr_bounds(s, l + 1); + return l; +} + +size_t mpxk_wrapper_strnlen(const char *s, size_t c) +{ + const size_t l = strnlen(s, c); + + __bnd_chk_ptr_bounds(s, l + 1); + return l; +}