Message ID | 1500855620-73004-4-git-send-email-keescook@chromium.org (mailing list archive) |
---|---|
State | New, archived |
* Kees Cook <keescook@chromium.org> wrote:

> +config ARCH_HAS_REFCOUNT
> +        bool
> +        help
> +          An architecture selects this when it has implemented refcount_t
> +          using primitizes that provide a faster runtime at the expense
> +          of some full refcount state checks. The refcount overflow condition,
> +          however, must be retained. Catching overflows is the primary
> +          security concern for protecting against bugs in reference counts.

s/primitizes/primitives

also, the 'faster runtime' and the whole explanation reads a bit weird to me,
how about something like:

        An architecture selects this when it has implemented refcount_t
        using open coded assembly primitives that provide an optimized
        refcount_t implementation, possibly at the expense of some full
        refcount state checks of CONFIG_REFCOUNT_FULL=y.

        The refcount overflow check behavior, however, must be retained.
        Catching overflows is the primary security concern for protecting
        against bugs in reference counts.

> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -55,6 +55,7 @@ config X86
>          select ARCH_HAS_KCOV                    if X86_64
>          select ARCH_HAS_MMIO_FLUSH
>          select ARCH_HAS_PMEM_API                if X86_64
> +        select ARCH_HAS_REFCOUNT
>          select ARCH_HAS_UACCESS_FLUSHCACHE      if X86_64
>          select ARCH_HAS_SET_MEMORY
>          select ARCH_HAS_SG_CHAIN

Just wondering, how was the 32-bit kernel tested?

> +/*
> + * Body of refcount error handling: in .text.unlikely, saved into CX the
> + * address of the refcount that has entered a bad state, and trigger an
> + * exception. Fixup address is back in regular execution flow in .text.

I had to read this 4 times to parse it (and even now I'm unsure whether I
parsed it correctly) - could this explanation be transformed to simpler, more
straightforward English?
> + */
> +#define _REFCOUNT_EXCEPTION                             \
> +        ".pushsection .text.unlikely\n"                 \
> +        "111:\tlea %[counter], %%" _ASM_CX "\n"         \
> +        "112:\t" ASM_UD0 "\n"                           \
> +        ASM_UNREACHABLE                                 \
> +        ".popsection\n"                                 \
> +        "113:\n"                                        \
> +        _ASM_EXTABLE_REFCOUNT(112b, 113b)

Would it be technically possible to use named labels instead of these random
numbered labels?

> +        /*
> +         * This function has been called because either a negative refcount
> +         * value was seen by any of the refcount functions, or a zero
> +         * refcount value was seen by refcount_dec().
> +         *
> +         * If we crossed from INT_MAX to INT_MIN, the OF flag (result
> +         * wrapped around) will be set. Additionally, seeing the refcount
> +         * reach 0 will set the ZF flag. In each of these cases we want a
> +         * report, since it's a boundary condition.

Small nit: 'ZF' stands for 'zero flag' - so we should either write 'zero flag'
or 'ZF' - 'ZF flag' is kind of redundant.

> +#else
> +static inline void refcount_error_report(struct pt_regs *regs,
> +                                         const char *msg) { }

By now you should know that for x86 code you should not break lines in such an
ugly fashion, right? :-)

Thanks,

        Ingo
On Mon, Jul 24, 2017 at 2:07 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Kees Cook <keescook@chromium.org> wrote:
>
>> +config ARCH_HAS_REFCOUNT
>> +        bool
>> +        help
>> +          An architecture selects this when it has implemented refcount_t
>> +          using primitizes that provide a faster runtime at the expense
>> +          of some full refcount state checks. The refcount overflow condition,
>> +          however, must be retained. Catching overflows is the primary
>> +          security concern for protecting against bugs in reference counts.
>
> s/primitizes/primitives

And, as an aside for anyone curious, I've just added this to my .vimrc:

" Enable spell checking in Kconfig files
autocmd BufEnter Kconfig* set spell
" Enable spell checking in git commit edits
autocmd BufEnter COMMIT_EDITMSG set spell

I'm always forgetting to double-check the spellchecker mode, and in those
files, it should just be enabled by default. :P

> also, the 'faster runtime' and the whole explanation reads a bit weird to me,
> how about something like:
>
>         An architecture selects this when it has implemented refcount_t
>         using open coded assembly primitives that provide an optimized
>         refcount_t implementation, possibly at the expense of some full
>         refcount state checks of CONFIG_REFCOUNT_FULL=y.
>
>         The refcount overflow check behavior, however, must be retained.
>         Catching overflows is the primary security concern for protecting
>         against bugs in reference counts.

Yeah, this is much clearer, thanks! I've replaced this for v8.

>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -55,6 +55,7 @@ config X86
>>          select ARCH_HAS_KCOV                    if X86_64
>>          select ARCH_HAS_MMIO_FLUSH
>>          select ARCH_HAS_PMEM_API                if X86_64
>> +        select ARCH_HAS_REFCOUNT
>>          select ARCH_HAS_UACCESS_FLUSHCACHE      if X86_64
>>          select ARCH_HAS_SET_MEMORY
>>          select ARCH_HAS_SG_CHAIN
>
> Just wondering, how was the 32-bit kernel tested?

I've got a qemu image for 32-bit too. Behavior and timing are similarly
matched.
I've got updates to LKDTM that provide test coverage of the refcount API.
(I was going to send that via the drivers tree, with some other LKDTM
updates.) If you want to see it, a not-quite-final version is here:
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/fast-refcount/ud/v6&id=5ebc36fd39a3c58bca770b81030626cf779103ee

>> +/*
>> + * Body of refcount error handling: in .text.unlikely, saved into CX the
>> + * address of the refcount that has entered a bad state, and trigger an
>> + * exception. Fixup address is back in regular execution flow in .text.
>
> I had to read this 4 times to parse it (and even now I'm unsure whether I
> parsed it correctly) - could this explanation be transformed to simpler,
> more straightforward English?

I've rewritten this now; hopefully it will be easier to parse.

>
>> + */
>> +#define _REFCOUNT_EXCEPTION                             \
>> +        ".pushsection .text.unlikely\n"                 \
>> +        "111:\tlea %[counter], %%" _ASM_CX "\n"         \
>> +        "112:\t" ASM_UD0 "\n"                           \
>> +        ASM_UNREACHABLE                                 \
>> +        ".popsection\n"                                 \
>> +        "113:\n"                                        \
>> +        _ASM_EXTABLE_REFCOUNT(112b, 113b)
>
> Would it be technically possible to use named labels instead of these
> random numbered labels?

My understanding was that using numbered labels allows us to have repeated
labels in the same function. (i.e. NUMb and NUMf can't, I think, be done
with text labels.)

>> +        /*
>> +         * This function has been called because either a negative refcount
>> +         * value was seen by any of the refcount functions, or a zero
>> +         * refcount value was seen by refcount_dec().
>> +         *
>> +         * If we crossed from INT_MAX to INT_MIN, the OF flag (result
>> +         * wrapped around) will be set. Additionally, seeing the refcount
>> +         * reach 0 will set the ZF flag. In each of these cases we want a
>> +         * report, since it's a boundary condition.
>
> Small nit: 'ZF' stands for 'zero flag' - so we should either write
> 'zero flag' or 'ZF' - 'ZF flag' is kind of redundant.

True!
I'll fix this (and the "OF flag" usage above there).

>> +#else
>> +static inline void refcount_error_report(struct pt_regs *regs,
>> +                                         const char *msg) { }
>
> By now you should know that for x86 code you should not break lines in such
> an ugly fashion, right? :-)

Yes, I agonized over this because I hated the look of each variation I tried
here. I'll try again. :)

Thanks for the review!

-Kees
diff --git a/arch/Kconfig b/arch/Kconfig
index 21d0089117fe..349185f951cd 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -931,6 +931,15 @@ config STRICT_MODULE_RWX
 config ARCH_WANT_RELAX_ORDER
         bool
 
+config ARCH_HAS_REFCOUNT
+        bool
+        help
+          An architecture selects this when it has implemented refcount_t
+          using primitizes that provide a faster runtime at the expense
+          of some full refcount state checks. The refcount overflow condition,
+          however, must be retained. Catching overflows is the primary
+          security concern for protecting against bugs in reference counts.
+
 config REFCOUNT_FULL
         bool "Perform full reference count validation at the expense of speed"
         help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 781521b7cf9e..73574c91e857 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -55,6 +55,7 @@ config X86
         select ARCH_HAS_KCOV                    if X86_64
         select ARCH_HAS_MMIO_FLUSH
         select ARCH_HAS_PMEM_API                if X86_64
+        select ARCH_HAS_REFCOUNT
         select ARCH_HAS_UACCESS_FLUSHCACHE      if X86_64
         select ARCH_HAS_SET_MEMORY
         select ARCH_HAS_SG_CHAIN
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 7a9df3beb89b..676ee5807d86 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -74,6 +74,9 @@
 # define _ASM_EXTABLE_EX(from, to)                      \
         _ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)
 
+# define _ASM_EXTABLE_REFCOUNT(from, to)                \
+        _ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
+
 # define _ASM_NOKPROBE(entry)                           \
         .pushsection "_kprobe_blacklist","aw" ;         \
         _ASM_ALIGN ;                                    \
@@ -123,6 +126,9 @@
 # define _ASM_EXTABLE_EX(from, to)                      \
         _ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)
 
+# define _ASM_EXTABLE_REFCOUNT(from, to)                \
+        _ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
+
 /* For C file, we already have NOKPROBE_SYMBOL macro */
 #endif
diff --git a/arch/x86/include/asm/refcount.h b/arch/x86/include/asm/refcount.h
new file mode 100644
index 000000000000..3c4c0576c048
--- /dev/null
+++ b/arch/x86/include/asm/refcount.h
@@ -0,0 +1,113 @@
+#ifndef __ASM_X86_REFCOUNT_H
+#define __ASM_X86_REFCOUNT_H
+/*
+ * x86-specific implementation of refcount_t. Based on PAX_REFCOUNT from
+ * PaX/grsecurity.
+ */
+#include <linux/refcount.h>
+
+/*
+ * Body of refcount error handling: in .text.unlikely, saved into CX the
+ * address of the refcount that has entered a bad state, and trigger an
+ * exception. Fixup address is back in regular execution flow in .text.
+ */
+#define _REFCOUNT_EXCEPTION                             \
+        ".pushsection .text.unlikely\n"                 \
+        "111:\tlea %[counter], %%" _ASM_CX "\n"         \
+        "112:\t" ASM_UD0 "\n"                           \
+        ASM_UNREACHABLE                                 \
+        ".popsection\n"                                 \
+        "113:\n"                                        \
+        _ASM_EXTABLE_REFCOUNT(112b, 113b)
+
+/* Trigger refcount exception if refcount result is negative. */
+#define REFCOUNT_CHECK_LT_ZERO                          \
+        "js 111f\n\t"                                   \
+        _REFCOUNT_EXCEPTION
+
+/* Trigger refcount exception if refcount result is zero or negative. */
+#define REFCOUNT_CHECK_LE_ZERO                          \
+        "jz 111f\n\t"                                   \
+        REFCOUNT_CHECK_LT_ZERO
+
+static __always_inline void refcount_add(unsigned int i, refcount_t *r)
+{
+        asm volatile(LOCK_PREFIX "addl %1,%0\n\t"
+                REFCOUNT_CHECK_LT_ZERO
+                : [counter] "+m" (r->refs.counter)
+                : "ir" (i)
+                : "cc", "cx");
+}
+
+static __always_inline void refcount_inc(refcount_t *r)
+{
+        asm volatile(LOCK_PREFIX "incl %0\n\t"
+                REFCOUNT_CHECK_LT_ZERO
+                : [counter] "+m" (r->refs.counter)
+                : : "cc", "cx");
+}
+
+static __always_inline void refcount_dec(refcount_t *r)
+{
+        asm volatile(LOCK_PREFIX "decl %0\n\t"
+                REFCOUNT_CHECK_LE_ZERO
+                : [counter] "+m" (r->refs.counter)
+                : : "cc", "cx");
+}
+
+static __always_inline __must_check
+bool refcount_sub_and_test(unsigned int i, refcount_t *r)
+{
+        GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl", REFCOUNT_CHECK_LT_ZERO,
+                                  r->refs.counter, "er", i, "%0", e);
+}
+
+static __always_inline __must_check bool refcount_dec_and_test(refcount_t *r)
+{
+        GEN_UNARY_SUFFIXED_RMWcc(LOCK_PREFIX "decl", REFCOUNT_CHECK_LT_ZERO,
+                                 r->refs.counter, "%0", e);
+}
+
+/**
+ * __refcount_add_unless - add unless the number is already a given value
+ * @r: pointer of type refcount_t
+ * @a: the amount to add to v...
+ * @u: ...unless v is equal to u.
+ *
+ * Atomically adds @a to @r, so long as @r was not already @u.
+ * Returns the old value of @r.
+ */
+static __always_inline __must_check
+int __refcount_add_unless(refcount_t *r, int a, int u)
+{
+        int c, new;
+
+        c = atomic_read(&(r->refs));
+        do {
+                if (unlikely(c == u))
+                        break;
+
+                asm volatile("addl %2,%0\n\t"
+                        REFCOUNT_CHECK_LT_ZERO
+                        : "=r" (new)
+                        : "0" (c), "ir" (a),
+                          [counter] "m" (r->refs.counter)
+                        : "cc", "cx");
+
+        } while (!atomic_try_cmpxchg(&(r->refs), &c, new));
+
+        return c;
+}
+
+static __always_inline __must_check
+bool refcount_add_not_zero(unsigned int i, refcount_t *r)
+{
+        return __refcount_add_unless(r, i, 0) != 0;
+}
+
+static __always_inline __must_check bool refcount_inc_not_zero(refcount_t *r)
+{
+        return refcount_add_not_zero(1, r);
+}
+
+#endif
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 0ea8afcb929c..956075fb3d59 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -36,6 +36,48 @@ bool ex_handler_fault(const struct exception_table_entry *fixup,
 }
 EXPORT_SYMBOL_GPL(ex_handler_fault);
 
+/*
+ * Handler for UD0 exception following a failed test against the
+ * result of a refcount inc/dec/add/sub.
+ */
+bool ex_handler_refcount(const struct exception_table_entry *fixup,
+                         struct pt_regs *regs, int trapnr)
+{
+        /* First unconditionally saturate the refcount. */
+        *(int *)regs->cx = INT_MIN / 2;
+
+        /*
+         * Strictly speaking, this reports the fixup destination, not
+         * the fault location, and not the actually overflowing
+         * instruction, which is the instruction before the "js", but
+         * since that instruction could be a variety of lengths, just
+         * report the location after the overflow, which should be close
+         * enough for finding the overflow, as it's at least back in
+         * the function, having returned from .text.unlikely.
+         */
+        regs->ip = ex_fixup_addr(fixup);
+
+        /*
+         * This function has been called because either a negative refcount
+         * value was seen by any of the refcount functions, or a zero
+         * refcount value was seen by refcount_dec().
+         *
+         * If we crossed from INT_MAX to INT_MIN, the OF flag (result
+         * wrapped around) will be set. Additionally, seeing the refcount
+         * reach 0 will set the ZF flag. In each of these cases we want a
+         * report, since it's a boundary condition.
+         *
+         */
+        if (regs->flags & (X86_EFLAGS_OF | X86_EFLAGS_ZF)) {
+                bool zero = regs->flags & X86_EFLAGS_ZF;
+
+                refcount_error_report(regs, zero ? "hit zero" : "overflow");
+        }
+
+        return true;
+}
+EXPORT_SYMBOL_GPL(ex_handler_refcount);
+
 bool ex_handler_ext(const struct exception_table_entry *fixup,
                     struct pt_regs *regs, int trapnr)
 {
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index bd6d96cf80b1..87de25965a92 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -277,6 +277,13 @@ extern int oops_may_print(void);
 void do_exit(long error_code) __noreturn;
 void complete_and_exit(struct completion *, long) __noreturn;
 
+#ifdef CONFIG_ARCH_HAS_REFCOUNT
+void refcount_error_report(struct pt_regs *regs, const char *msg);
+#else
+static inline void refcount_error_report(struct pt_regs *regs,
+                                         const char *msg) { }
+#endif
+
 /* Internal, do not use. */
 int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res);
 int __must_check _kstrtol(const char *s, unsigned int base, long *res);
diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 591792c8e5b0..48b7c9c68c4d 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -53,6 +53,9 @@ extern __must_check bool refcount_sub_and_test(unsigned int i, refcount_t *r);
 extern __must_check bool refcount_dec_and_test(refcount_t *r);
 extern void refcount_dec(refcount_t *r);
 #else
+# ifdef CONFIG_ARCH_HAS_REFCOUNT
+# include <asm/refcount.h>
+# else
 static inline __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
 {
         return atomic_add_unless(&r->refs, i, 0);
@@ -87,6 +90,7 @@ static inline void refcount_dec(refcount_t *r)
 {
         atomic_dec(&r->refs);
 }
+# endif /* !CONFIG_ARCH_HAS_REFCOUNT */
 #endif /* CONFIG_REFCOUNT_FULL */
 
 extern __must_check bool refcount_dec_if_one(refcount_t *r);
diff --git a/kernel/panic.c b/kernel/panic.c
index a58932b41700..a33d6bc4b0ce 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -26,6 +26,7 @@
 #include <linux/nmi.h>
 #include <linux/console.h>
 #include <linux/bug.h>
+#include <linux/ratelimit.h>
 
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
@@ -601,6 +602,17 @@ EXPORT_SYMBOL(__stack_chk_fail);
 
 #endif
 
+#ifdef CONFIG_ARCH_HAS_REFCOUNT
+void refcount_error_report(struct pt_regs *regs, const char *msg)
+{
+        WARN_RATELIMIT(1, "refcount %s detected at %pB in %s[%d], uid/euid: %u/%u\n",
+                msg, (void *)instruction_pointer(regs),
+                current->comm, task_pid_nr(current),
+                from_kuid_munged(&init_user_ns, current_uid()),
+                from_kuid_munged(&init_user_ns, current_euid()));
+}
+#endif
+
 core_param(panic, panic_timeout, int, 0644);
 core_param(pause_on_oops, pause_on_oops, int, 0644);
 core_param(panic_on_warn, panic_on_warn, int, 0644);