[0/3] kasan: memorize and print call_rcu stack

Message ID 20200506051853.14380-1-walter-zh.wu@mediatek.com (mailing list archive)

Message

Walter Wu May 6, 2020, 5:18 a.m. UTC
This patchset improves KASAN reports by adding call_rcu() call stack
information. This helps programmers debug use-after-free and
double-free memory issues.

The KASAN report looks as follows (cleaned up slightly):

BUG: KASAN: use-after-free in kasan_rcu_reclaim+0x58/0x60

Freed by task 0:
 save_stack+0x24/0x50
 __kasan_slab_free+0x110/0x178
 kasan_slab_free+0x10/0x18
 kfree+0x98/0x270
 kasan_rcu_reclaim+0x1c/0x60
 rcu_core+0x8b4/0x10f8
 rcu_core_si+0xc/0x18
 efi_header_end+0x238/0xa6c

First call_rcu() call stack:
 save_stack+0x24/0x50
 kasan_record_callrcu+0xc8/0xd8
 call_rcu+0x190/0x580
 kasan_rcu_uaf+0x1d8/0x278

Last call_rcu() call stack:
(stack is not available)


Add a new CONFIG option that records the first and last call_rcu() call
stacks, so that the KASAN report prints both of them.

This option doesn't increase memory consumption. It is only suitable
for generic KASAN.
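
A minimal sketch of the first/last bookkeeping described above, inferred only from the sample report (where a single call_rcu() fills the first stack and leaves the last one unavailable). The names stack_handle_t, struct rcu_stack_record and record_call_rcu() are hypothetical stand-ins, not identifiers from the patches, and the handles are plain integers rather than real saved stacks:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a saved-stack handle; 0 means "no stack". */
typedef uint32_t stack_handle_t;

/* Hypothetical per-object record holding two handles. */
struct rcu_stack_record {
	stack_handle_t first; /* set once, by the first call_rcu() */
	stack_handle_t last;  /* overwritten by every later call_rcu() */
};

/* Record one call_rcu() invocation on the object that owns 'rec'. */
static void record_call_rcu(struct rcu_stack_record *rec, stack_handle_t handle)
{
	assert(rec != NULL && handle != 0);

	if (rec->first == 0)
		rec->first = handle;   /* first call: keep it for good */
	else
		rec->last = handle;    /* later calls: remember only the newest */
}
```

Under this assumed scheme, an object that went through call_rcu() exactly once has an empty "last" slot, which would correspond to the "(stack is not available)" line in the report.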

[1] https://bugzilla.kernel.org/show_bug.cgi?id=198437

Walter Wu (3):
rcu/kasan: record and print call_rcu() call stack
kasan: record and print the free track
kasan: add KASAN_RCU_STACK_RECORD documentation

Documentation/dev-tools/kasan.rst | 21 +++++++++++++++++++++
include/linux/kasan.h             |  7 +++++++
kernel/rcu/tree.c                 |  5 +++++
lib/Kconfig.kasan                 | 11 +++++++++++
mm/kasan/common.c                 | 31 +++++++++++++++++++++++++++++++
mm/kasan/kasan.h                  | 12 ++++++++++++
mm/kasan/report.c                 | 53 ++++++++++++++++++++++++++++++++++++++++++++++-------
7 files changed, 133 insertions(+), 7 deletions(-)

Comments

Qian Cai May 6, 2020, 5:53 a.m. UTC | #1
> On May 6, 2020, at 1:19 AM, Walter Wu <walter-zh.wu@mediatek.com> wrote:
> Add new CONFIG option to record first and last call_rcu() call stack
> and KASAN report prints two call_rcu() call stack.
> 
> This option doesn't increase the cost of memory consumption. It is
> only suitable for generic KASAN.

I don’t understand why this needs to be a Kconfig option at all. If call_rcu() stacks are useful in general, then just always gather that information. How do developers judge whether they need to select this option or not?
Walter Wu May 6, 2020, 6:23 a.m. UTC | #2
On Wed, 2020-05-06 at 01:53 -0400, Qian Cai wrote:
> I don’t understand why this needs to be a Kconfig option at all. If call_rcu() stacks are useful in general, then just always gather those information. How do developers judge if they need to select this option or not?

Because we don't want to increase the slub metadata size. Enabling this
option prints the call_rcu() stacks, but an in-use slub object then
doesn't print its free stack. So for an out-of-bounds issue, the report
will not print the free stack. It is a trade-off, see [1].

[1] https://bugzilla.kernel.org/show_bug.cgi?id=198437
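
To illustrate how such a trade-off can arise, here is a rough sketch in which a fixed-size metadata slot holds either a free track or two call_rcu() stack handles, but never both. The layout and all names (struct track, struct meta_plain, struct meta_shared, store_rcu_stacks()) are assumptions made for illustration; the real KASAN metadata layout may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t stack_handle_t; /* hypothetical saved-stack handle */

/* Hypothetical track: which task did it, and from which stack. */
struct track {
	uint32_t pid;
	stack_handle_t stack;
};

/* Baseline metadata: one alloc track and one free track. */
struct meta_plain {
	struct track alloc_track;
	struct track free_track;
};

/* Same footprint: the free-track slot is reused for two call_rcu() handles. */
struct meta_shared {
	struct track alloc_track;
	union {
		struct track free_track;
		stack_handle_t rcu_stack[2]; /* [0] = first, [1] = last */
	} u;
};

_Static_assert(sizeof(struct meta_shared) == sizeof(struct meta_plain),
	       "reusing the slot must not grow the metadata");

/* Storing call_rcu() handles overwrites whatever free track was in the slot. */
static void store_rcu_stacks(struct meta_shared *m,
			     stack_handle_t first, stack_handle_t last)
{
	assert(m != NULL);
	m->u.rcu_stack[0] = first;
	m->u.rcu_stack[1] = last;
}
```

The size stays constant because the two uses share storage, and that sharing is also why one kind of stack is lost whenever the other is recorded.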

Thanks
Dmitry Vyukov May 6, 2020, 9:37 a.m. UTC | #3
On Wed, May 6, 2020 at 8:23 AM Walter Wu <walter-zh.wu@mediatek.com> wrote:
> > I don’t understand why this needs to be a Kconfig option at all. If call_rcu() stacks are useful in general, then just always gather those information. How do developers judge if they need to select this option or not?
>
> Because we don't want to increase slub meta-data size, so enabling this
> option can print call_rcu() stacks, but the in-use slub object doesn't
> print free stack. So if have out-of-bound issue, then it will not print
> free stack. It is a trade-off, see [1].
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=198437

Hi Walter,

Great you are tackling this!

I have the same general sentiment as Qian. I would enable this
unconditionally because:

1. We still can't get both the rcu stack and the free stack. I would
assume most kernel testing systems need to enable this (we would
definitely enable it on syzbot). This means we would not have the free
stack for allocated objects in any reports coming from testing systems,
which greatly diminishes the value of the other mode.

2. The kernel is undertested. Introducing any additional configuration
options is a problem in that context. Chances are that some of the
modes are not working or will break in the future.

3. That free stack actually causes lots of confusion and I never found
it useful:
https://bugzilla.kernel.org/show_bug.cgi?id=198425
If it's a very delayed UAF, either one may get another report for the
same bug with not so delayed UAF, or if it's way too delayed, then the
previous free stack is wrong as well.

4. Most users don't care enough about debugging tools to learn every
detail of each one and spend time fine-tuning it for their context.
Most KASAN users won't even be aware of this choice, and they will just
use whatever is the default.

5. Each configuration option increases implementation complexity.

What would have value is if we figure out how to make both of them
work at the same time without increasing memory consumption. But I
don't see any way to do this.

I propose to make this the only mode. I am sure lots of users will
find this additional stack useful, whereas the free stack is
frequently even confusing.
Walter Wu May 6, 2020, 12:01 p.m. UTC | #4
On Wed, 2020-05-06 at 11:37 +0200, 'Dmitry Vyukov' via kasan-dev wrote:
> On Wed, May 6, 2020 at 8:23 AM Walter Wu <walter-zh.wu@mediatek.com> wrote:
> > > I don’t understand why this needs to be a Kconfig option at all. If call_rcu() stacks are useful in general, then just always gather those information. How do developers judge if they need to select this option or not?
> >
> > Because we don't want to increase slub meta-data size, so enabling this
> > option can print call_rcu() stacks, but the in-use slub object doesn't
> > print free stack. So if have out-of-bound issue, then it will not print
> > free stack. It is a trade-off, see [1].
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=198437
> 
> Hi Walter,
> 
> Great you are tackling this!
> 
> I have the same general sentiment as Qian. I would enable this
> unconditionally because:
> 
> 1. We still can't get both rcu stack and free stack. I would assume
> most kernel testing systems need to enable this (we definitely enable
> on syzbot). This means we do not have free stack for allocation
> objects in any reports coming from testing systems. Which greatly
> diminishes the value of the other mode.
> 
> 2. Kernel is undertested. Introducing any additional configuration
> options is a problem in such context. Chances are that some of the
> modes are not working or will break in future.
> 
> 3. That free stack actually causes lots of confusion and I never found
> it useful:
> https://bugzilla.kernel.org/show_bug.cgi?id=198425
> If it's a very delayed UAF, either one may get another report for the
> same bug with not so delayed UAF, or if it's way too delayed, then the
> previous free stack is wrong as well.
> 
> 4. Most users don't care that much about debugging tools to learn
> every bit of every debugging tool and spend time fine-tuning it for
> their context. Most KASAN users won't even be aware of this choice,
> and they will just use whatever is the default.
> 
> 5. Each configuration option increases implementation complexity.
> 
> What would have value is if we figure out how to make both of them
> work at the same time without increasing memory consumption. But I
> don't see any way to do this.
> 
> I propose to make this the only mode. I am sure lots of users will
> find this additional stack useful, whereas the free stack is even
> frequently confusing.
> 

Ok.
If we enable it by default, it should still only work in generic KASAN,
because we need to get the object status (allocated or freed) from
shadow memory, and tag-based KASAN can't do that. So should we enable it
by default in generic KASAN only?
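
As a rough illustration of reading an object's status from shadow memory: generic KASAN keeps one shadow byte per 8 bytes of memory, where 0 means fully accessible and negative values mark different kinds of inaccessible memory, such as freed objects. The toy model below assumes a tiny 64-byte heap, and every name and the marker value 0xFB in it are illustrative assumptions, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_GRANULE      8u    /* bytes of memory covered by one shadow byte */
#define SKETCH_SHADOW_FREED 0xFBu /* illustrative marker for "freed object" */
#define SKETCH_MEM_SIZE     64u   /* size of the toy heap */

/* Toy shadow map for the toy heap; 0 means "fully accessible". */
static uint8_t sketch_shadow[SKETCH_MEM_SIZE / SKETCH_GRANULE];

/* Mark every granule of [offset, offset + size) as freed. */
static void sketch_poison_freed(size_t offset, size_t size)
{
	size_t end = (offset + size + SKETCH_GRANULE - 1) / SKETCH_GRANULE;

	assert(offset % SKETCH_GRANULE == 0 && offset + size <= SKETCH_MEM_SIZE);
	for (size_t i = offset / SKETCH_GRANULE; i < end; i++)
		sketch_shadow[i] = SKETCH_SHADOW_FREED;
}

/* Report whether the object starting at 'offset' is currently marked freed. */
static bool sketch_object_is_freed(size_t offset)
{
	assert(offset < SKETCH_MEM_SIZE);
	return sketch_shadow[offset / SKETCH_GRANULE] == SKETCH_SHADOW_FREED;
}
```

A report path could use a check like sketch_object_is_freed() to decide which stacks are meaningful to print for a given object.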
Dmitry Vyukov May 6, 2020, 12:16 p.m. UTC | #5
On Wed, May 6, 2020 at 2:01 PM Walter Wu <walter-zh.wu@mediatek.com> wrote:
>
> On Wed, 2020-05-06 at 11:37 +0200, 'Dmitry Vyukov' via kasan-dev wrote:
> > On Wed, May 6, 2020 at 8:23 AM Walter Wu <walter-zh.wu@mediatek.com> wrote:
> > > > I don’t understand why this needs to be a Kconfig option at all. If call_rcu() stacks are useful in general, then just always gather those information. How do developers judge if they need to select this option or not?
> > >
> > > Because we don't want to increase slub meta-data size, so enabling this
> > > option can print call_rcu() stacks, but the in-use slub object doesn't
> > > print free stack. So if have out-of-bound issue, then it will not print
> > > free stack. It is a trade-off, see [1].
> > >
> > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=198437
> >
> > Hi Walter,
> >
> > Great you are tackling this!
> >
> > I have the same general sentiment as Qian. I would enable this
> > unconditionally because:
> >
> > 1. We still can't get both rcu stack and free stack. I would assume
> > most kernel testing systems need to enable this (we definitely enable
> > on syzbot). This means we do not have free stack for allocation
> > objects in any reports coming from testing systems. Which greatly
> > diminishes the value of the other mode.
> >
> > 2. Kernel is undertested. Introducing any additional configuration
> > options is a problem in such context. Chances are that some of the
> > modes are not working or will break in future.
> >
> > 3. That free stack actually causes lots of confusion and I never found
> > it useful:
> > https://bugzilla.kernel.org/show_bug.cgi?id=198425
> > If it's a very delayed UAF, either one may get another report for the
> > same bug with not so delayed UAF, or if it's way too delayed, then the
> > previous free stack is wrong as well.
> >
> > 4. Most users don't care that much about debugging tools to learn
> > every bit of every debugging tool and spend time fine-tuning it for
> > their context. Most KASAN users won't even be aware of this choice,
> > and they will just use whatever is the default.
> >
> > 5. Each configuration option increases implementation complexity.
> >
> > What would have value is if we figure out how to make both of them
> > work at the same time without increasing memory consumption. But I
> > don't see any way to do this.
> >
> > I propose to make this the only mode. I am sure lots of users will
> > find this additional stack useful, whereas the free stack is even
> > frequently confusing.
> >
>
> Ok.
> If we want to have a default enabling it, but it should only work in
> generic KASAN, because we need to get object status(allocation or
> freeing) from shadow memory, tag-based KASAN can't do it. So we should
> have a default enabling it in generic KASAN?

Yes, let's do that: generic KASAN always memorizes rcu stacks; tag-based
KASAN never memorizes rcu stacks. No new configuration options.
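
A sketch of what "no new configurations" could look like in code: the recording hook is keyed only on the existing KASAN mode, with no separate option. SKETCH_KASAN_GENERIC is a hypothetical stand-in for the kernel's CONFIG_KASAN_GENERIC, and the function and variable names are assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in for the existing mode symbol: 1 models generic KASAN,
 * 0 models tag-based KASAN. There is no dedicated rcu-stack option.
 */
#ifndef SKETCH_KASAN_GENERIC
#define SKETCH_KASAN_GENERIC 1
#endif

static uint32_t sketch_last_rcu_stack; /* 0 means "nothing recorded" */

#if SKETCH_KASAN_GENERIC
/* Generic mode: always remember the handle of the call_rcu() stack. */
static bool sketch_record_call_rcu(uint32_t handle)
{
	assert(handle != 0);
	sketch_last_rcu_stack = handle;
	return true;
}
#else
/* Tag-based mode: the hook is a no-op and records nothing. */
static bool sketch_record_call_rcu(uint32_t handle)
{
	(void)handle;
	return false;
}
#endif
```

Keying the hook on the mode alone means every generic KASAN build exercises the same code path, which is in line with the testing and complexity concerns raised earlier in the thread.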