Message ID: 1531999889-18343-1-git-send-email-alex.popov@linux.com (mailing list archive)
State: New, archived
On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
> Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
> 'stack_erasing_bypass' sysctl. It can be used in runtime to disable
> kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
> Stack erasing will then remain disabled and STACKLEAK_METRICS will not
> be updated until the next boot.
>
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> [...]
> +That erasing reduces the information which kernel stack leak bugs
> +can reveal and blocks some uninitialized stack variable attacks.
> +The tradeoff is the performance impact: on a single CPU system kernel
> +compilation sees a 1% slowdown, other systems and workloads may vary.

I continue to have a hard time measuring even the 1% impact. Clearly I
need some better workloads. :)

> [...]
> asmlinkage void stackleak_erase(void)
> {
> 	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
> @@ -22,6 +52,11 @@ asmlinkage void stackleak_erase(void)
> 	unsigned int poison_count = 0;
> 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>
> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
> +	if (static_branch_unlikely(&stack_erasing_bypass))
> +		return;
> +#endif

I collapsed this into a macro (and took your other fix) and will push
this to my -next tree:

+#define skip_erasing()	static_branch_unlikely(&stack_erasing_bypass)
+#else
+#define skip_erasing()	false
+#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
...
+	if (skip_erasing())
+		return;
+

> +
> 	/* Search for the poison value in the kernel stack */
> 	while (kstack_ptr > boundary && poison_count <= depth) {
> 		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
> 	 */
> 	unsigned long sp = (unsigned long)&sp;
>
> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
> +	if (static_branch_unlikely(&stack_erasing_bypass))
> +		return;
> +#endif

I would expect stackleak_erase() to be the expensive part, not the
tracking part? Shouldn't timings be unchanged by leaving this in
unconditionally, which would mean the sysctl could be re-enabled?

-Kees
On 25.07.2018 01:56, Kees Cook wrote:
> On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
>> Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
>> 'stack_erasing_bypass' sysctl. It can be used in runtime to disable
>> kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
>> Stack erasing will then remain disabled and STACKLEAK_METRICS will not
>> be updated until the next boot.
>>
>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>> [...]
>> +That erasing reduces the information which kernel stack leak bugs
>> +can reveal and blocks some uninitialized stack variable attacks.
>> +The tradeoff is the performance impact: on a single CPU system kernel
>> +compilation sees a 1% slowdown, other systems and workloads may vary.
>
> I continue to have a hard time measuring even the 1% impact. Clearly I
> need some better workloads. :)
>
>> [...]
>> asmlinkage void stackleak_erase(void)
>> {
>> 	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
>> @@ -22,6 +52,11 @@ asmlinkage void stackleak_erase(void)
>> 	unsigned int poison_count = 0;
>> 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>
>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>> +	if (static_branch_unlikely(&stack_erasing_bypass))
>> +		return;
>> +#endif
>
> I collapsed this into a macro (and took your other fix) and will push
> this to my -next tree:
>
> +#define skip_erasing()	static_branch_unlikely(&stack_erasing_bypass)
> +#else
> +#define skip_erasing()	false
> +#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
> ...
> +	if (skip_erasing())
> +		return;
> +

That's nice! Thank you, I'll test it tomorrow.

>> +
>> 	/* Search for the poison value in the kernel stack */
>> 	while (kstack_ptr > boundary && poison_count <= depth) {
>> 		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
>> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
>> 	 */
>> 	unsigned long sp = (unsigned long)&sp;
>>
>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>> +	if (static_branch_unlikely(&stack_erasing_bypass))
>> +		return;
>> +#endif
>
> I would expect stackleak_erase() to be the expensive part, not the
> tracking part? Shouldn't timings be unchanged by leaving this in
> unconditionally, which would mean the sysctl could be re-enabled?

Dropping the bypass in stackleak_track_stack() will not help against the
troubles with re-enabling stack erasing (tracking and erasing depend on
each other). Moreover, it will also make the STACKLEAK_METRICS show
insane values. So I think we should have the bypass in both functions.

Best regards,
Alexander
On Tue, Jul 24, 2018 at 4:41 PM, Alexander Popov <alex.popov@linux.com> wrote:
> On 25.07.2018 01:56, Kees Cook wrote:
>> On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
>>> Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
>>> 'stack_erasing_bypass' sysctl. It can be used in runtime to disable
>>> kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
>>> Stack erasing will then remain disabled and STACKLEAK_METRICS will not
>>> be updated until the next boot.
>>>
>>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>>> [...]
>>> +That erasing reduces the information which kernel stack leak bugs
>>> +can reveal and blocks some uninitialized stack variable attacks.
>>> +The tradeoff is the performance impact: on a single CPU system kernel
>>> +compilation sees a 1% slowdown, other systems and workloads may vary.
>>
>> I continue to have a hard time measuring even the 1% impact. Clearly I
>> need some better workloads. :)
>>
>>> [...]
>>> asmlinkage void stackleak_erase(void)
>>> {
>>> 	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
>>> @@ -22,6 +52,11 @@ asmlinkage void stackleak_erase(void)
>>> 	unsigned int poison_count = 0;
>>> 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>>
>>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>>> +	if (static_branch_unlikely(&stack_erasing_bypass))
>>> +		return;
>>> +#endif
>>
>> I collapsed this into a macro (and took your other fix) and will push
>> this to my -next tree:
>>
>> +#define skip_erasing()	static_branch_unlikely(&stack_erasing_bypass)
>> +#else
>> +#define skip_erasing()	false
>> +#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
>> ...
>> +	if (skip_erasing())
>> +		return;
>> +
>
> That's nice! Thank you, I'll test it tomorrow.
>
>>> +
>>> 	/* Search for the poison value in the kernel stack */
>>> 	while (kstack_ptr > boundary && poison_count <= depth) {
>>> 		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
>>> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
>>> 	 */
>>> 	unsigned long sp = (unsigned long)&sp;
>>>
>>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>>> +	if (static_branch_unlikely(&stack_erasing_bypass))
>>> +		return;
>>> +#endif
>>
>> I would expect stackleak_erase() to be the expensive part, not the
>> tracking part? Shouldn't timings be unchanged by leaving this in
>> unconditionally, which would mean the sysctl could be re-enabled?
>
> Dropping the bypass in stackleak_track_stack() will not help against the
> troubles with re-enabling stack erasing (tracking and erasing depend on each

Isn't the tracking checking "sp < current->lowest_stack", so if
erasure was off, lowest_stack would only ever get further into the
stack? And when erasure was turned back on, it would start getting
reset correctly again. Or is the concern the poison searching could
break? It seems like it would still work right? I must be missing
something. :)

> other). Moreover, it will also make the STACKLEAK_METRICS show insane values. So
> I think we should have the bypass in both functions.

I left it as-is for now. It should appear in -next tomorrow. Thanks!

-Kees
On 25.07.2018 02:59, Kees Cook wrote:
> On Tue, Jul 24, 2018 at 4:41 PM, Alexander Popov <alex.popov@linux.com> wrote:
>> On 25.07.2018 01:56, Kees Cook wrote:
>>> On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
>>>> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
>>>> 	 */
>>>> 	unsigned long sp = (unsigned long)&sp;
>>>>
>>>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>>>> +	if (static_branch_unlikely(&stack_erasing_bypass))
>>>> +		return;
>>>> +#endif
>>>
>>> I would expect stackleak_erase() to be the expensive part, not the
>>> tracking part? Shouldn't timings be unchanged by leaving this in
>>> unconditionally, which would mean the sysctl could be re-enabled?
>>
>> Dropping the bypass in stackleak_track_stack() will not help against the
>> troubles with re-enabling stack erasing (tracking and erasing depend on each
>
> Isn't the tracking checking "sp < current->lowest_stack", so if
> erasure was off, lowest_stack would only ever get further into the
> stack? And when erasure was turned back on, it would start getting
> reset correctly again. Or is the concern the poison searching could
> break? It seems like it would still work right? I must be missing
> something. :)

Umm.. You are right, that would be a solution. Let's assume that we:
 - allow stackleak_track_stack() to keep working,
 - skip stackleak_erase(), which gives most of the performance penalty.

When we enable the 'stack_erasing_bypass', the 'lowest_stack' is not
reset at the end of a syscall; it just continues to go down during the
next syscalls (because tracking is still enabled). In some sense it is
similar to having a very long syscall.

Now if we re-enable erasing, the poison search in stackleak_erase()
starts from the _valid_ 'lowest_stack', which should work fine.

I'll send the improved version of the patch soon. Thanks!

Best regards,
Alexander
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index eded671d..63b7493 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -87,6 +87,7 @@ show up in /proc/sys/kernel:
 - shmmni
 - softlockup_all_cpu_backtrace
 - soft_watchdog
+- stack_erasing_bypass
 - stop-a                      [ SPARC only ]
 - sysrq                       ==> Documentation/admin-guide/sysrq.rst
 - sysctl_writes_strict
@@ -962,6 +963,24 @@ detect a hard lockup condition.

 ==============================================================

+stack_erasing_bypass
+
+This parameter can be used to disable kernel stack erasing at the end
+of syscalls for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
+
+That erasing reduces the information which kernel stack leak bugs
+can reveal and blocks some uninitialized stack variable attacks.
+The tradeoff is the performance impact: on a single CPU system kernel
+compilation sees a 1% slowdown, other systems and workloads may vary.
+
+  0: do nothing - stack erasing is enabled by default.
+
+  1: enable stack erasing bypass - stack erasing will then remain
+     disabled and STACKLEAK_METRICS will not be updated until the
+     next boot.
+
+==============================================================
+
 tainted:

 Numeric values, which can be
diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
index b911b97..e1fc3d1 100644
--- a/include/linux/stackleak.h
+++ b/include/linux/stackleak.h
@@ -22,6 +22,12 @@ static inline void stackleak_task_init(struct task_struct *t)
 	t->prev_lowest_stack = t->lowest_stack;
 # endif
 }
+
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+int stack_erasing_bypass_sysctl(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos);
+#endif
+
 #else /* !CONFIG_GCC_PLUGIN_STACKLEAK */
 static inline void stackleak_task_init(struct task_struct *t)
 {
 }
 #endif
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index f5c4111..f731c9a 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -14,6 +14,36 @@

 #include <linux/stackleak.h>

+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+#include <linux/jump_label.h>
+
+static DEFINE_STATIC_KEY_FALSE(stack_erasing_bypass);
+
+int stack_erasing_bypass_sysctl(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret = 0;
+	int state = static_branch_unlikely(&stack_erasing_bypass);
+
+	table->data = &state;
+	table->maxlen = sizeof(int);
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (ret || !write)
+		return ret;
+
+	/* Stack erasing re-enabling is not supported */
+	if (static_branch_unlikely(&stack_erasing_bypass))
+		return -EOPNOTSUPP;
+
+	if (state) {
+		static_branch_enable(&stack_erasing_bypass);
+		pr_warn("stackleak: stack erasing is disabled until reboot\n");
+	}
+
+	return ret;
+}
+#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
+
 asmlinkage void stackleak_erase(void)
 {
 	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
@@ -22,6 +52,11 @@ asmlinkage void stackleak_erase(void)
 	unsigned int poison_count = 0;
 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);

+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+	if (static_branch_unlikely(&stack_erasing_bypass))
+		return;
+#endif
+
 	/* Search for the poison value in the kernel stack */
 	while (kstack_ptr > boundary && poison_count <= depth) {
 		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
@@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
 	 */
 	unsigned long sp = (unsigned long)&sp;

+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+	if (static_branch_unlikely(&stack_erasing_bypass))
+		return;
+#endif
+
 	/*
 	 * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
 	 * STACKLEAK_SEARCH_DEPTH makes the poison search in
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 2d9837c..0ac25ca 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -91,7 +91,9 @@
 #ifdef CONFIG_CHR_DEV_SG
 #include <scsi/sg.h>
 #endif
-
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+#include <linux/stackleak.h>
+#endif
 #ifdef CONFIG_LOCKUP_DETECTOR
 #include <linux/nmi.h>
 #endif
@@ -1230,6 +1232,17 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &one,
 	},
 #endif
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+	{
+		.procname	= "stack_erasing_bypass",
+		.data		= NULL,
+		.maxlen		= sizeof(int),
+		.mode		= 0600,
+		.proc_handler	= stack_erasing_bypass_sysctl,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+#endif
 	{ }
 };
diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
index 292161d..0028945 100644
--- a/scripts/gcc-plugins/Kconfig
+++ b/scripts/gcc-plugins/Kconfig
@@ -182,4 +182,14 @@ config STACKLEAK_METRICS
 	  can be useful for estimating the STACKLEAK performance impact for
 	  your workloads.

+config STACKLEAK_RUNTIME_DISABLE
+	bool "Allow runtime disabling of kernel stack erasing"
+	depends on GCC_PLUGIN_STACKLEAK
+	help
+	  This option provides 'stack_erasing_bypass' sysctl, which can be
+	  used in runtime to disable kernel stack erasing for kernels built
+	  with CONFIG_GCC_PLUGIN_STACKLEAK. Stack erasing will then remain
+	  disabled and STACKLEAK_METRICS will not be updated until the
+	  next boot.
+
 endif
Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
'stack_erasing_bypass' sysctl. It can be used in runtime to disable
kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
Stack erasing will then remain disabled and STACKLEAK_METRICS will not
be updated until the next boot.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 Documentation/sysctl/kernel.txt | 19 +++++++++++++++++++
 include/linux/stackleak.h       |  6 ++++++
 kernel/stackleak.c              | 40 ++++++++++++++++++++++++++++++++++++++++
 kernel/sysctl.c                 | 15 ++++++++++++++-
 scripts/gcc-plugins/Kconfig     | 10 ++++++++++
 5 files changed, 89 insertions(+), 1 deletion(-)