diff mbox series

[v2,2/3] kfence: maximize allocation wait timeout duration

Message ID 20210421105132.3965998-3-elver@google.com (mailing list archive)
State New, archived
Headers show
Series kfence: optimize timer scheduling | expand

Commit Message

Marco Elver April 21, 2021, 10:51 a.m. UTC
The allocation wait timeout was initially added because of warnings due
to CONFIG_DETECT_HUNG_TASK=y [1]. While the 1 sec timeout is sufficient
to resolve the warnings (given the hung task timeout must be 1 sec or
larger) it may cause unnecessary wake-ups if the system is idle.
[1] https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com

Fix it by computing the timeout duration in terms of the current
sysctl_hung_task_timeout_secs value.

Signed-off-by: Marco Elver <elver@google.com>
---
 mm/kfence/core.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Kefeng Wang Sept. 16, 2021, 1:02 a.m. UTC | #1
On 2021/4/21 18:51, Marco Elver wrote:
> The allocation wait timeout was initially added because of warnings due
> to CONFIG_DETECT_HUNG_TASK=y [1]. While the 1 sec timeout is sufficient
> to resolve the warnings (given the hung task timeout must be 1 sec or
> larger) it may cause unnecessary wake-ups if the system is idle.
> [1] https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com
>
> Fix it by computing the timeout duration in terms of the current
> sysctl_hung_task_timeout_secs value.
>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>   mm/kfence/core.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 235d726f88bc..9742649f3f88 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -20,6 +20,7 @@
>   #include <linux/moduleparam.h>
>   #include <linux/random.h>
>   #include <linux/rcupdate.h>
> +#include <linux/sched/sysctl.h>
>   #include <linux/seq_file.h>
>   #include <linux/slab.h>
>   #include <linux/spinlock.h>
> @@ -621,7 +622,16 @@ static void toggle_allocation_gate(struct work_struct *work)
>   	/* Enable static key, and await allocation to happen. */
>   	static_branch_enable(&kfence_allocation_key);
>   
> -	wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
> +	if (sysctl_hung_task_timeout_secs) {
> +		/*
> +		 * During low activity with no allocations we might wait a
> +		 * while; let's avoid the hung task warning.
> +		 */
> +		wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
> +				   sysctl_hung_task_timeout_secs * HZ / 2);
> +	} else {
> +		wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
> +	}
>   
>   	/* Disable static key and reset timer. */
>   	static_branch_disable(&kfence_allocation_key);
Kefeng Wang Sept. 16, 2021, 1:20 a.m. UTC | #2
Hi Marco,

We found kfence_test will fails  on ARM64 with this patch with/without  
CONFIG_DETECT_HUNG_TASK,

Any thought ?


On 2021/9/16 9:02, Kefeng Wang wrote:
>
> On 2021/4/21 18:51, Marco Elver wrote:
>> The allocation wait timeout was initially added because of warnings due
>> to CONFIG_DETECT_HUNG_TASK=y [1]. While the 1 sec timeout is sufficient
>> to resolve the warnings (given the hung task timeout must be 1 sec or
>> larger) it may cause unnecessary wake-ups if the system is idle.
>> [1] 
>> https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com
>>
>> Fix it by computing the timeout duration in terms of the current
>> sysctl_hung_task_timeout_secs value.
>>
>> Signed-off-by: Marco Elver <elver@google.com>
>> ---
>>   mm/kfence/core.c | 12 +++++++++++-
>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
>> index 235d726f88bc..9742649f3f88 100644
>> --- a/mm/kfence/core.c
>> +++ b/mm/kfence/core.c
>> @@ -20,6 +20,7 @@
>>   #include <linux/moduleparam.h>
>>   #include <linux/random.h>
>>   #include <linux/rcupdate.h>
>> +#include <linux/sched/sysctl.h>
>>   #include <linux/seq_file.h>
>>   #include <linux/slab.h>
>>   #include <linux/spinlock.h>
>> @@ -621,7 +622,16 @@ static void toggle_allocation_gate(struct 
>> work_struct *work)
>>       /* Enable static key, and await allocation to happen. */
>>       static_branch_enable(&kfence_allocation_key);
>>   -    wait_event_timeout(allocation_wait, 
>> atomic_read(&kfence_allocation_gate), HZ);
>> +    if (sysctl_hung_task_timeout_secs) {
>> +        /*
>> +         * During low activity with no allocations we might wait a
>> +         * while; let's avoid the hung task warning.
>> +         */
>> +        wait_event_timeout(allocation_wait, 
>> atomic_read(&kfence_allocation_gate),
>> +                   sysctl_hung_task_timeout_secs * HZ / 2);
>> +    } else {
>> +        wait_event(allocation_wait, 
>> atomic_read(&kfence_allocation_gate));
>> +    }
>>         /* Disable static key and reset timer. */
>>       static_branch_disable(&kfence_allocation_key);
Marco Elver Sept. 16, 2021, 8:49 a.m. UTC | #3
On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> Hi Marco,
>
> We found kfence_test will fails  on ARM64 with this patch with/without
> CONFIG_DETECT_HUNG_TASK,
>
> Any thought ?

Please share log and instructions to reproduce if possible. Also, if
possible, please share bisection log that led you to this patch.

I currently do not see how this patch would cause that, it only
increases the timeout duration.

I know that under QEMU TCG mode, there are occasionally timeouts in
the test simply due to QEMU being extremely slow or other weirdness.
David Laight Sept. 16, 2021, 3:45 p.m. UTC | #4
From: Kefeng Wang
> Sent: 16 September 2021 02:21
> 
> We found kfence_test will fails  on ARM64 with this patch with/without
> CONFIG_DETECT_HUNG_TASK,
> 
> Any thought ?
> 
...
> >>       /* Enable static key, and await allocation to happen. */
> >>       static_branch_enable(&kfence_allocation_key);
> >>   -    wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
> >> +    if (sysctl_hung_task_timeout_secs) {
> >> +        /*
> >> +         * During low activity with no allocations we might wait a
> >> +         * while; let's avoid the hung task warning.
> >> +         */
> >> +        wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
> >> +                   sysctl_hung_task_timeout_secs * HZ / 2);
> >> +    } else {
> >> +        wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
> >> +    }
> >>         /* Disable static key and reset timer. */
> >>       static_branch_disable(&kfence_allocation_key);

It has replaced a wait_event_timeout() with a wait_event().

That probably isn't intended.
Although I'd expect their to be some test for the wait being
signalled or timing out.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Marco Elver Sept. 16, 2021, 3:48 p.m. UTC | #5
On Thu, 16 Sept 2021 at 17:45, David Laight <David.Laight@aculab.com> wrote:
>
> From: Kefeng Wang
> > Sent: 16 September 2021 02:21
> >
> > We found kfence_test will fails  on ARM64 with this patch with/without
> > CONFIG_DETECT_HUNG_TASK,
> >
> > Any thought ?
> >
> ...
> > >>       /* Enable static key, and await allocation to happen. */
> > >>       static_branch_enable(&kfence_allocation_key);
> > >>   -    wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
> > >> +    if (sysctl_hung_task_timeout_secs) {
> > >> +        /*
> > >> +         * During low activity with no allocations we might wait a
> > >> +         * while; let's avoid the hung task warning.
> > >> +         */
> > >> +        wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
> > >> +                   sysctl_hung_task_timeout_secs * HZ / 2);
> > >> +    } else {
> > >> +        wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
> > >> +    }
> > >>         /* Disable static key and reset timer. */
> > >>       static_branch_disable(&kfence_allocation_key);
>
> It has replaced a wait_event_timeout() with a wait_event().
>
> That probably isn't intended.
> Although I'd expect their to be some test for the wait being
> signalled or timing out.

It is intended -- there's a wake_up() for this. See the whole patch
series for explanation.

The whole reason we had the timeout was to avoid the hung task
warnings, but we can do better if there is no hung task warning
enabled.
Liu Shixin Sept. 18, 2021, 8:07 a.m. UTC | #6
On 2021/9/16 16:49, Marco Elver wrote:
> On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>> Hi Marco,
>>
>> We found kfence_test will fails  on ARM64 with this patch with/without
>> CONFIG_DETECT_HUNG_TASK,
>>
>> Any thought ?
> Please share log and instructions to reproduce if possible. Also, if
> possible, please share bisection log that led you to this patch.
>
> I currently do not see how this patch would cause that, it only
> increases the timeout duration.
>
> I know that under QEMU TCG mode, there are occasionally timeouts in
> the test simply due to QEMU being extremely slow or other weirdness.
>
> .
>
Hi Marco,

There are some of the results of the current test:
1. Using qemu-kvm on arm64 machine, all testcase can pass.
2. Using qemu-system-aarch64 on x86_64 machine, randomly some testcases fail.
3. Using qemu-system-aarch64 on x86_64, but removing the judgment of kfence_allocation_key in kfence_alloc(), all testcase can pass.

I add some printing to the kernel and get very strange results.
I add a new variable kfence_allocation_key_gate to track the
state of kfence_allocation_key. As shown in the following code, theoretically,
if kfence_allocation_key_gate is zero, then kfence_allocation_key must be
enabled, so the value of variable error in kfence_alloc() should always be
zero. In fact, all the passed testcases fit this point. But as shown in the
following failed log, although kfence_allocation_key has been enabled, it's
still check failed here.

So I think static_key might be problematic in my qemu environment.
The change of timeout is not a problem but caused us to observe this problem.
I tried changing the wait_event to a loop. I set timeout to HZ and re-enable/disabled
in each loop, then the failed testcase disappears.

[    3.463519]     # Subtest: kfence
[    3.463629]     1..25
[    3.465548]     # test_out_of_bounds_read: test_alloc: size=128, gfp=cc0, policy=left, cache=0
[    3.561001] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[    3.561934] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[    3.665449] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[   13.464796] --------------kfence_allocation_key check failed 13839286 times----------------
[   13.467482] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[   13.469166]     # test_out_of_bounds_read: ASSERTION FAILED at mm/kfence/kfence_test.c:308
[   13.469166]     Expected false to be true, but is false
[   13.469166]
[   13.469166] failed to allocate from KFENCE
[   13.473592]     not ok 1 - test_out_of_bounds_read


diff --git a/include/linux/kfence.h b/include/linux/kfence.h
index 3fe6dd8a18c1..e72889606e82 100644
--- a/include/linux/kfence.h
+++ b/include/linux/kfence.h
@@ -25,6 +25,7 @@ extern char *__kfence_pool;
 #ifdef CONFIG_KFENCE_STATIC_KEYS
 #include <linux/static_key.h>
 DECLARE_STATIC_KEY_FALSE(kfence_allocation_key);
+extern atomic_t kfence_allocation_key_gate;
 #else
 #include <linux/atomic.h>
 extern atomic_t kfence_allocation_gate;
@@ -116,12 +117,20 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);
  */
 static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 {
+       static int error;
 #ifdef CONFIG_KFENCE_STATIC_KEYS
-       if (static_branch_unlikely(&kfence_allocation_key))
+       if (static_branch_unlikely(&kfence_allocation_key)) {
 #else
-       if (unlikely(!atomic_read(&kfence_allocation_gate)))
+       if (unlikely(!atomic_read(&kfence_allocation_gate))) {
 #endif
+               if (error) {
+                       pr_info("--------------kfence_allocation_key check failed %d times----------------\n", error);
+                       error = 0;
+               }
                return __kfence_alloc(s, size, flags);
+       }
+       if (!atomic_read(&kfence_allocation_key_gate))
+               error++;
        return NULL;
 }
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 7a97db8bc8e7..637c2efa6133 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -100,6 +100,7 @@ static DEFINE_RAW_SPINLOCK(kfence_freelist_lock); /* Lock protecting freelist. *
 #ifdef CONFIG_KFENCE_STATIC_KEYS
 /* The static key to set up a KFENCE allocation. */
 DEFINE_STATIC_KEY_FALSE(kfence_allocation_key);
+atomic_t kfence_allocation_key_gate = ATOMIC_INIT(1);
 #endif
 
 /* Gates the allocation, ensuring only one succeeds in a given period. */
@@ -624,7 +625,9 @@ static void toggle_allocation_gate(struct work_struct *work)
 #ifdef CONFIG_KFENCE_STATIC_KEYS
        /* Enable static key, and await allocation to happen. */
        static_branch_enable(&kfence_allocation_key);
-
+       if (static_branch_unlikely(&kfence_allocation_key))
+               pr_info("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n");
+       atomic_set(&kfence_allocation_key_gate, 0);
        if (sysctl_hung_task_timeout_secs) {
                /*
                 * During low activity with no allocations we might wait a
@@ -637,7 +640,10 @@ static void toggle_allocation_gate(struct work_struct *work)
        }
 
        /* Disable static key and reset timer. */
+       atomic_set(&kfence_allocation_key_gate, 1);
        static_branch_disable(&kfence_allocation_key);
+       if (!static_branch_unlikely(&kfence_allocation_key))
+                       pr_info("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n");
 #endif
        queue_delayed_work(system_unbound_wq, &kfence_timer,
                           msecs_to_jiffies(kfence_sample_interval));

thanks,
Marco Elver Sept. 18, 2021, 9:37 a.m. UTC | #7
On Sat, 18 Sept 2021 at 10:07, Liu Shixin <liushixin2@huawei.com> wrote:
>
> On 2021/9/16 16:49, Marco Elver wrote:
> > On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >> Hi Marco,
> >>
> >> We found kfence_test will fails  on ARM64 with this patch with/without
> >> CONFIG_DETECT_HUNG_TASK,
> >>
> >> Any thought ?
> > Please share log and instructions to reproduce if possible. Also, if
> > possible, please share bisection log that led you to this patch.
> >
> > I currently do not see how this patch would cause that, it only
> > increases the timeout duration.
> >
> > I know that under QEMU TCG mode, there are occasionally timeouts in
> > the test simply due to QEMU being extremely slow or other weirdness.
> >
> >
> Hi Marco,
>
> There are some of the results of the current test:
> 1. Using qemu-kvm on arm64 machine, all testcase can pass.
> 2. Using qemu-system-aarch64 on x86_64 machine, randomly some testcases fail.
> 3. Using qemu-system-aarch64 on x86_64, but removing the judgment of kfence_allocation_key in kfence_alloc(), all testcase can pass.
>
> I add some printing to the kernel and get very strange results.
> I add a new variable kfence_allocation_key_gate to track the
> state of kfence_allocation_key. As shown in the following code, theoretically,
> if kfence_allocation_key_gate is zero, then kfence_allocation_key must be
> enabled, so the value of variable error in kfence_alloc() should always be
> zero. In fact, all the passed testcases fit this point. But as shown in the
> following failed log, although kfence_allocation_key has been enabled, it's
> still check failed here.
>
> So I think static_key might be problematic in my qemu environment.
> The change of timeout is not a problem but caused us to observe this problem.
> I tried changing the wait_event to a loop. I set timeout to HZ and re-enable/disabled
> in each loop, then the failed testcase disappears.

Nice analysis, thanks! What I gather is that static_keys/jump_labels
are somehow broken in QEMU.

This does remind me that I found a bug in QEMU that might be relevant:
https://bugs.launchpad.net/qemu/+bug/1920934
Looks like it was never fixed. :-/

The failures I encountered caused the kernel to crash, but never saw
the kfence test to fail due to that (never managed to get that far).
Though the bug I saw was on x86 TCG mode, and I never tried arm64. If
you can, try to build a QEMU with ASan and see if you also get the
same use-after-free bug.

Unless we observe the problem on a real machine, I think for now we
can conclude with fairly high confidence that QEMU TCG still has
issues and cannot be fully trusted here (see bug above).

Thanks,
-- Marco
Marco Elver Sept. 18, 2021, 9:45 a.m. UTC | #8
On Sat, 18 Sept 2021 at 11:37, Marco Elver <elver@google.com> wrote:
>
> On Sat, 18 Sept 2021 at 10:07, Liu Shixin <liushixin2@huawei.com> wrote:
> >
> > On 2021/9/16 16:49, Marco Elver wrote:
> > > On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> > >> Hi Marco,
> > >>
> > >> We found kfence_test will fails  on ARM64 with this patch with/without
> > >> CONFIG_DETECT_HUNG_TASK,
> > >>
> > >> Any thought ?
> > > Please share log and instructions to reproduce if possible. Also, if
> > > possible, please share bisection log that led you to this patch.
> > >
> > > I currently do not see how this patch would cause that, it only
> > > increases the timeout duration.
> > >
> > > I know that under QEMU TCG mode, there are occasionally timeouts in
> > > the test simply due to QEMU being extremely slow or other weirdness.
> > >
> > >
> > Hi Marco,
> >
> > There are some of the results of the current test:
> > 1. Using qemu-kvm on arm64 machine, all testcase can pass.
> > 2. Using qemu-system-aarch64 on x86_64 machine, randomly some testcases fail.
> > 3. Using qemu-system-aarch64 on x86_64, but removing the judgment of kfence_allocation_key in kfence_alloc(), all testcase can pass.
> >
> > I add some printing to the kernel and get very strange results.
> > I add a new variable kfence_allocation_key_gate to track the
> > state of kfence_allocation_key. As shown in the following code, theoretically,
> > if kfence_allocation_key_gate is zero, then kfence_allocation_key must be
> > enabled, so the value of variable error in kfence_alloc() should always be
> > zero. In fact, all the passed testcases fit this point. But as shown in the
> > following failed log, although kfence_allocation_key has been enabled, it's
> > still check failed here.
> >
> > So I think static_key might be problematic in my qemu environment.
> > The change of timeout is not a problem but caused us to observe this problem.
> > I tried changing the wait_event to a loop. I set timeout to HZ and re-enable/disabled
> > in each loop, then the failed testcase disappears.
>
> Nice analysis, thanks! What I gather is that static_keys/jump_labels
> are somehow broken in QEMU.
>
> This does remind me that I found a bug in QEMU that might be relevant:
> https://bugs.launchpad.net/qemu/+bug/1920934
> Looks like it was never fixed. :-/
>
> The failures I encountered caused the kernel to crash, but never saw
> the kfence test to fail due to that (never managed to get that far).
> Though the bug I saw was on x86 TCG mode, and I never tried arm64. If

[ ... that is, I didn't try running QEMU-ASan in arm64 TCG mode ... of
course I use QEMU arm64 to test. ;-) ]

> you can, try to build a QEMU with ASan and see if you also get the
> same use-after-free bug.
>
> Unless we observe the problem on a real machine, I think for now we
> can conclude with fairly high confidence that QEMU TCG still has
> issues and cannot be fully trusted here (see bug above).
>
> Thanks,
> -- Marco
diff mbox series

Patch

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 235d726f88bc..9742649f3f88 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -20,6 +20,7 @@ 
 #include <linux/moduleparam.h>
 #include <linux/random.h>
 #include <linux/rcupdate.h>
+#include <linux/sched/sysctl.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
@@ -621,7 +622,16 @@  static void toggle_allocation_gate(struct work_struct *work)
 	/* Enable static key, and await allocation to happen. */
 	static_branch_enable(&kfence_allocation_key);
 
-	wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
+	if (sysctl_hung_task_timeout_secs) {
+		/*
+		 * During low activity with no allocations we might wait a
+		 * while; let's avoid the hung task warning.
+		 */
+		wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
+				   sysctl_hung_task_timeout_secs * HZ / 2);
+	} else {
+		wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
+	}
 
 	/* Disable static key and reset timer. */
 	static_branch_disable(&kfence_allocation_key);