
[v14,7/7] stackleak, sysctl: Allow runtime disabling of kernel stack erasing

Message ID 1531999889-18343-1-git-send-email-alex.popov@linux.com (mailing list archive)
State New, archived

Commit Message

Alexander Popov July 19, 2018, 11:31 a.m. UTC
Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
'stack_erasing_bypass' sysctl. It can be used at runtime to disable
kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
Stack erasing will then remain disabled and STACKLEAK_METRICS will not
be updated until the next boot.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 Documentation/sysctl/kernel.txt | 19 +++++++++++++++++++
 include/linux/stackleak.h       |  6 ++++++
 kernel/stackleak.c              | 40 ++++++++++++++++++++++++++++++++++++++++
 kernel/sysctl.c                 | 15 ++++++++++++++-
 scripts/gcc-plugins/Kconfig     | 10 ++++++++++
 5 files changed, 89 insertions(+), 1 deletion(-)

Comments

Kees Cook July 24, 2018, 10:56 p.m. UTC | #1
On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
> Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
> 'stack_erasing_bypass' sysctl. It can be used in runtime to disable
> kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
> Stack erasing will then remain disabled and STACKLEAK_METRICS will not
> be updated until the next boot.
>
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> [...]
> +That erasing reduces the information which kernel stack leak bugs
> +can reveal and blocks some uninitialized stack variable attacks.
> +The tradeoff is the performance impact: on a single CPU system kernel
> +compilation sees a 1% slowdown, other systems and workloads may vary.

I continue to have a hard time measuring even the 1% impact. Clearly I
need some better workloads. :)

> [...]
>  asmlinkage void stackleak_erase(void)
>  {
>         /* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
> @@ -22,6 +52,11 @@ asmlinkage void stackleak_erase(void)
>         unsigned int poison_count = 0;
>         const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>
> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
> +       if (static_branch_unlikely(&stack_erasing_bypass))
> +               return;
> +#endif

I collapsed this into a macro (and took your other fix) and will push
this to my -next tree:

+#define skip_erasing() static_branch_unlikely(&stack_erasing_bypass)
+#else
+#define skip_erasing() false
+#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
...
+       if (skip_erasing())
+               return;
+

> +
>         /* Search for the poison value in the kernel stack */
>         while (kstack_ptr > boundary && poison_count <= depth) {
>                 if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
>          */
>         unsigned long sp = (unsigned long)&sp;
>
> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
> +       if (static_branch_unlikely(&stack_erasing_bypass))
> +               return;
> +#endif

I would expect stackleak_erase() to be the expensive part, not the
tracking part? Shouldn't timings be unchanged by leaving this in
unconditionally, which would mean the sysctl could be re-enabled?

-Kees
Alexander Popov July 24, 2018, 11:41 p.m. UTC | #2
On 25.07.2018 01:56, Kees Cook wrote:
> On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
>> Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
>> 'stack_erasing_bypass' sysctl. It can be used in runtime to disable
>> kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
>> Stack erasing will then remain disabled and STACKLEAK_METRICS will not
>> be updated until the next boot.
>>
>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>> [...]
>> +That erasing reduces the information which kernel stack leak bugs
>> +can reveal and blocks some uninitialized stack variable attacks.
>> +The tradeoff is the performance impact: on a single CPU system kernel
>> +compilation sees a 1% slowdown, other systems and workloads may vary.
> 
> I continue to have a hard time measuring even the 1% impact. Clearly I
> need some better workloads. :)
> 
>> [...]
>>  asmlinkage void stackleak_erase(void)
>>  {
>>         /* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
>> @@ -22,6 +52,11 @@ asmlinkage void stackleak_erase(void)
>>         unsigned int poison_count = 0;
>>         const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>
>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>> +       if (static_branch_unlikely(&stack_erasing_bypass))
>> +               return;
>> +#endif
> 
> I collapsed this into a macro (and took your other fix) and will push
> this to my -next tree:
> 
> +#define skip_erasing() static_branch_unlikely(&stack_erasing_bypass)
> +#else
> +#define skip_erasing() false
> +#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
> ...
> +       if (skip_erasing())
> +               return;
> +

That's nice! Thank you, I'll test it tomorrow.

>> +
>>         /* Search for the poison value in the kernel stack */
>>         while (kstack_ptr > boundary && poison_count <= depth) {
>>                 if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
>> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
>>          */
>>         unsigned long sp = (unsigned long)&sp;
>>
>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>> +       if (static_branch_unlikely(&stack_erasing_bypass))
>> +               return;
>> +#endif
> 
> I would expect stackleak_erase() to be the expensive part, not the
> tracking part? Shouldn't timings be unchanged by leaving this in
> unconditionally, which would mean the sysctl could be re-enabled?

Dropping the bypass in stackleak_track_stack() will not solve the trouble
with re-enabling stack erasing (tracking and erasing depend on each other).
Moreover, it would also make STACKLEAK_METRICS show bogus values. So I think
we should have the bypass in both functions.

Best regards,
Alexander
Kees Cook July 24, 2018, 11:59 p.m. UTC | #3
On Tue, Jul 24, 2018 at 4:41 PM, Alexander Popov <alex.popov@linux.com> wrote:
> On 25.07.2018 01:56, Kees Cook wrote:
>> On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
>>> Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
>>> 'stack_erasing_bypass' sysctl. It can be used in runtime to disable
>>> kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
>>> Stack erasing will then remain disabled and STACKLEAK_METRICS will not
>>> be updated until the next boot.
>>>
>>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>>> [...]
>>> +That erasing reduces the information which kernel stack leak bugs
>>> +can reveal and blocks some uninitialized stack variable attacks.
>>> +The tradeoff is the performance impact: on a single CPU system kernel
>>> +compilation sees a 1% slowdown, other systems and workloads may vary.
>>
>> I continue to have a hard time measuring even the 1% impact. Clearly I
>> need some better workloads. :)
>>
>>> [...]
>>>  asmlinkage void stackleak_erase(void)
>>>  {
>>>         /* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
>>> @@ -22,6 +52,11 @@ asmlinkage void stackleak_erase(void)
>>>         unsigned int poison_count = 0;
>>>         const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>>
>>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>>> +       if (static_branch_unlikely(&stack_erasing_bypass))
>>> +               return;
>>> +#endif
>>
>> I collapsed this into a macro (and took your other fix) and will push
>> this to my -next tree:
>>
>> +#define skip_erasing() static_branch_unlikely(&stack_erasing_bypass)
>> +#else
>> +#define skip_erasing() false
>> +#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
>> ...
>> +       if (skip_erasing())
>> +               return;
>> +
>
> That's nice! Thank you, I'll test it tomorrow.
>
>>> +
>>>         /* Search for the poison value in the kernel stack */
>>>         while (kstack_ptr > boundary && poison_count <= depth) {
>>>                 if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
>>> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
>>>          */
>>>         unsigned long sp = (unsigned long)&sp;
>>>
>>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>>> +       if (static_branch_unlikely(&stack_erasing_bypass))
>>> +               return;
>>> +#endif
>>
>> I would expect stackleak_erase() to be the expensive part, not the
>> tracking part? Shouldn't timings be unchanged by leaving this in
>> unconditionally, which would mean the sysctl could be re-enabled?
>
> Dropping the bypass in stackleak_track_stack() will not help against the
> troubles with re-enabling stack erasing (tracking and erasing depend on each

Isn't the tracking checking "sp < current->lowest_stack", so if
erasure was off, lowest_stack would only ever get further into the
stack? And when erasure was turned back on, it would start getting
reset correctly again. Or is the concern the poison searching could
break? It seems like it would still work right? I must be missing
something. :)

> other). Moreover, it will also make the STACKLEAK_METRICS show insane values. So
> I think we should have the bypass in both functions.

I left it as-is for now. It should appear in -next tomorrow.

Thanks!

-Kees
Alexander Popov July 26, 2018, 10:18 a.m. UTC | #4
On 25.07.2018 02:59, Kees Cook wrote:
> On Tue, Jul 24, 2018 at 4:41 PM, Alexander Popov <alex.popov@linux.com> wrote:
>> On 25.07.2018 01:56, Kees Cook wrote:
>>> On Thu, Jul 19, 2018 at 4:31 AM, Alexander Popov <alex.popov@linux.com> wrote:
>>>> @@ -78,6 +113,11 @@ void __used stackleak_track_stack(void)
>>>>          */
>>>>         unsigned long sp = (unsigned long)&sp;
>>>>
>>>> +#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
>>>> +       if (static_branch_unlikely(&stack_erasing_bypass))
>>>> +               return;
>>>> +#endif
>>>
>>> I would expect stackleak_erase() to be the expensive part, not the
>>> tracking part? Shouldn't timings be unchanged by leaving this in
>>> unconditionally, which would mean the sysctl could be re-enabled?
>>
>> Dropping the bypass in stackleak_track_stack() will not help against the
>> troubles with re-enabling stack erasing (tracking and erasing depend on each
> 
> Isn't the tracking checking "sp < current->lowest_stack", so if
> erasure was off, lowest_stack would only ever get further into the
> stack? And when erasure was turned back on, it would start getting
> reset correctly again. Or is the concern the poison searching could
> break? It seems like it would still work right? I must be missing
> something. :)

Umm.. You are right, that would be a solution. Let's assume that we:
 - allow stackleak_track_stack() to work,
 - skip stackleak_erase(), which gives most of the performance penalty.
When we enable 'stack_erasing_bypass', 'lowest_stack' is not reset at the end
of a syscall; it just keeps going down during subsequent syscalls (because
tracking stays enabled). In a sense this is similar to one very long syscall.
Now if we re-enable erasing, the poison search in stackleak_erase() starts
from the still-valid 'lowest_stack', which should work fine.

I'll send the improved version of the patch soon. Thanks!

Best regards,
Alexander

Patch

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index eded671d..63b7493 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -87,6 +87,7 @@  show up in /proc/sys/kernel:
 - shmmni
 - softlockup_all_cpu_backtrace
 - soft_watchdog
+- stack_erasing_bypass
 - stop-a                      [ SPARC only ]
 - sysrq                       ==> Documentation/admin-guide/sysrq.rst
 - sysctl_writes_strict
@@ -962,6 +963,24 @@  detect a hard lockup condition.
 
 ==============================================================
 
+stack_erasing_bypass
+
+This parameter can be used to disable kernel stack erasing at the end
+of syscalls for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
+
+That erasing reduces the information which kernel stack leak bugs
+can reveal and blocks some uninitialized stack variable attacks.
+The tradeoff is the performance impact: on a single CPU system kernel
+compilation sees a 1% slowdown, other systems and workloads may vary.
+
+  0: do nothing - stack erasing is enabled by default.
+
+  1: enable stack erasing bypass - stack erasing will then remain
+     disabled and STACKLEAK_METRICS will not be updated until the
+     next boot.
+
+==============================================================
+
 tainted:
 
 Non-zero if the kernel has been tainted. Numeric values, which can be
diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
index b911b97..e1fc3d1 100644
--- a/include/linux/stackleak.h
+++ b/include/linux/stackleak.h
@@ -22,6 +22,12 @@  static inline void stackleak_task_init(struct task_struct *t)
 	t->prev_lowest_stack = t->lowest_stack;
 # endif
 }
+
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+int stack_erasing_bypass_sysctl(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos);
+#endif
+
 #else /* !CONFIG_GCC_PLUGIN_STACKLEAK */
 static inline void stackleak_task_init(struct task_struct *t) { }
 #endif
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index f5c4111..f731c9a 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -14,6 +14,36 @@ 
 
 #include <linux/stackleak.h>
 
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+#include <linux/jump_label.h>
+
+static DEFINE_STATIC_KEY_FALSE(stack_erasing_bypass);
+
+int stack_erasing_bypass_sysctl(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret = 0;
+	int state = static_branch_unlikely(&stack_erasing_bypass);
+
+	table->data = &state;
+	table->maxlen = sizeof(int);
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (ret || !write)
+		return ret;
+
+	/* Stack erasing re-enabling is not supported */
+	if (static_branch_unlikely(&stack_erasing_bypass))
+		return -EOPNOTSUPP;
+
+	if (state) {
+		static_branch_enable(&stack_erasing_bypass);
+		pr_warn("stackleak: stack erasing is disabled until reboot\n");
+	}
+
+	return ret;
+}
+#endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
+
 asmlinkage void stackleak_erase(void)
 {
 	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
@@ -22,6 +52,11 @@  asmlinkage void stackleak_erase(void)
 	unsigned int poison_count = 0;
 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
 
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+	if (static_branch_unlikely(&stack_erasing_bypass))
+		return;
+#endif
+
 	/* Search for the poison value in the kernel stack */
 	while (kstack_ptr > boundary && poison_count <= depth) {
 		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
@@ -78,6 +113,11 @@  void __used stackleak_track_stack(void)
 	 */
 	unsigned long sp = (unsigned long)&sp;
 
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+	if (static_branch_unlikely(&stack_erasing_bypass))
+		return;
+#endif
+
 	/*
 	 * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
 	 * STACKLEAK_SEARCH_DEPTH makes the poison search in
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 2d9837c..0ac25ca 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -91,7 +91,9 @@ 
 #ifdef CONFIG_CHR_DEV_SG
 #include <scsi/sg.h>
 #endif
-
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+#include <linux/stackleak.h>
+#endif
 #ifdef CONFIG_LOCKUP_DETECTOR
 #include <linux/nmi.h>
 #endif
@@ -1230,6 +1232,17 @@  static struct ctl_table kern_table[] = {
 		.extra2		= &one,
 	},
 #endif
+#ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
+	{
+		.procname	= "stack_erasing_bypass",
+		.data		= NULL,
+		.maxlen		= sizeof(int),
+		.mode		= 0600,
+		.proc_handler	= stack_erasing_bypass_sysctl,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+#endif
 	{ }
 };
 
diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
index 292161d..0028945 100644
--- a/scripts/gcc-plugins/Kconfig
+++ b/scripts/gcc-plugins/Kconfig
@@ -182,4 +182,14 @@  config STACKLEAK_METRICS
 	  can be useful for estimating the STACKLEAK performance impact for
 	  your workloads.
 
+config STACKLEAK_RUNTIME_DISABLE
+	bool "Allow runtime disabling of kernel stack erasing"
+	depends on GCC_PLUGIN_STACKLEAK
+	help
+	  This option provides 'stack_erasing_bypass' sysctl, which can be
+	  used in runtime to disable kernel stack erasing for kernels built
+	  with CONFIG_GCC_PLUGIN_STACKLEAK. Stack erasing will then remain
+	  disabled and STACKLEAK_METRICS will not be updated until the
+	  next boot.
+
 endif